Abstract. The aim of this paper is to use non-asymptotic bounds for the probability of
rare events in the Sanov theorem, in order to study the asymptotics in conditional limit
theorems (Gibbs conditioning principle for thin sets). Applications to stochastic mechanics
or calibration problems for diffusion processes are discussed.



1. Introduction 

Let $X_1, X_2, \ldots$ be i.i.d. random variables taking their values in some metrizable space $(E, d)$.
Set $M_n = \frac{1}{n} \sum_{i=1}^n X_i$ the empirical mean (assuming here that $E$ is a vector space) and
$L_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}$ their empirical measure. In recent years new efforts have been made in
order to understand the asymptotic behavior of laws conditioned by some rare or super-rare
event.

The celebrated Gibbs conditioning principle is the corresponding meta principle for the
empirical measure, namely

$$\lim_{n \to +\infty} \mathbb{P}^{\otimes n}\big((X_1, \ldots, X_k) \in B \mid L_n \in A\big) = (\mu^*)^{\otimes k}(B) ,$$

where $\mu^*$ minimizes the relative entropy $H(\mu^* \mid \mu)$ among the elements in $A$ ($\mu$ being the
common law of the $X_i$). When $A$ is thin (i.e. $\mathbb{P}^{\otimes n}(L_n \in A) = 0$), such a statement is
meaningless, so one can either try to look at regular disintegration (the so called "thin shell"
case) or look at some enlargement of $A$. The first idea is also meaningless in general (see
however the work by Diaconis and Freedman [16]). Therefore we shall focus on the second one.

An enlargement $A_\varepsilon$ is then a non thin set containing $A$, and the previous statement becomes
a double limit one, i.e.

$$\lim_{\varepsilon \to 0} \; \lim_{n \to +\infty} .$$

Precise hypotheses are known for this meta principle ("thick shell" case) to become a rigorous
result, and refinements (namely one can choose some increasing $k(n)$) are known (see e.g.

and the references therein). One possible way to prove this result is to identify relative 
entropy with the rate function in the Large Deviations Principle for empirical measures 
(Sanov's theorem). In this paper we will introduce an intermediate "approximate thin shell"
case, i.e. we will look at the case when the enlargement size depends on $n$, i.e. $\varepsilon_n \to 0$. We
shall also discuss in detail one case of "super-thin" set, i.e. when relative entropy is infinite
for any element in $A$.

Of course since we are considering conditional probabilities, we are led to get both lower and 
upper non asymptotic estimates for the probability of rare events. 

The paper is organized as follows. 

In Section 2 we shall introduce the notations and recall some results we shall use repeatedly.
Then we give the main general result (Theorem 2.7).

When A is some closed subspace (i.e. defined thanks to linear constraints), our program can 
be carried out by directly using well known inequalities for the sum of independent variables. 
This will be explained in Section 3.

The more general case of a general convex constraint is studied in Section 4. In the compact
case upper estimates are well known and lower estimates will be derived thanks to a result
by Deuschel and Stroock. In both cases one has to compute the metric entropy (i.e. the
number of small balls needed to cover $A$) for some metric compatible with the convergence of
measures. The extension to non compact convex constraints is done by choosing an adequate
rich enough compact subset.

Section 5 is devoted to some examples, first in a finite dimensional space. We next show
that the Schrödinger bridges and the Nelson processes studied in Stochastic Mechanics are
natural "limiting processes" for constraints of marginal type.

Section 6 is devoted to the study of a super-thin example corresponding to the well known
problem of volatility calibration in Mathematical Finance. Our aim is to give a rigorous status
to the "Relative Entropy Minimization method" introduced in [2]. The problem here is to
choose the diffusion coefficient (volatility) of a diffusion process with a given drift (risk neutral
drift), knowing some final moments of the diffusion process. Of course all the possible choices
are mutually singular so that the constraint set $A$ does not contain any measure with finite
relative entropy, i.e. is super-thin. We shall show that under some conditions, the method
by Avellaneda et al. [2] enters our framework, hence furnishes the natural candidate from
a statistical point of view (we shall not discuss any kind of financial related aspects).

Another famous example of super-thin set is furnished by Statistical Mechanics, namely: are
the Gibbs measures associated to some Hamiltonian the limiting measures of some conditional
law of large numbers? The positive answer gives an interpretation of the famous Equivalence
of Ensembles principle (see [22], [15]). It would be interesting to relate the Gibbs variational
principle as in [13] to the above Gibbs conditioning principle. This is not done here.

Acknowledgements. We want to warmly acknowledge Christian Léonard for so many animated
conversations on Large Deviations Problems, and for indicating to us various references
on the topic.

2. Notation and first basic results. 

Throughout the paper $(E,d)$ will be a Polish space. $M_1(E)$ (resp. $M(E)$) will denote the
set of probability measures (resp. bounded signed measures) on $E$ equipped with its Borel
$\sigma$-field. $M_1(E)$ is equipped with the narrow topology (convergence in law) and its natural
Borel $\sigma$-field.
In the sequel, we will consider a sequence $X_1, X_2, \ldots$ of i.i.d. $E$ valued random variables.
The common law of the $X_i$'s will be denoted by $\alpha$ and their empirical measure by
$L_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}$.

Our aim is to study the asymptotic behavior of the conditional law
(2.1) $\alpha^n_{A_n,k}(B) = \mathbb{P}\big((X_1, \ldots, X_k) \in B \mid L_n \in A_n\big)$ ,

for some $A_n$ going to some thin set $A$ when $n$ goes to $\infty$.

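The first tool we need is relative entropy. Recall that for $\beta$ and $\gamma$ in $M_1(E)$, the relative
entropy $H(\beta \mid \gamma)$ is defined by the two equivalent formulas

(2.2.1) $H(\beta \mid \gamma) = \int \log\left(\frac{d\beta}{d\gamma}\right) d\beta$ , if this quantity is well defined and finite, $+\infty$ otherwise,

(2.2.2) $H(\beta \mid \gamma) = \sup \left\{ \int \varphi \, d\beta - \log \int e^{\varphi} \, d\gamma \; , \; \varphi \in C_b(E) \right\}$ .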

If $B$ is a measurable set of $M_1(E)$ we will write

(2.3) $H(B \mid \gamma) = \inf \{ H(\beta \mid \gamma) \; , \; \beta \in B \}$ .
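The two formulas (2.2.1) and (2.2.2) can be compared numerically on a finite state space. The following is a minimal sketch, not part of the original paper: the three-point laws `beta` and `gamma` and the test function `phi_other` are arbitrary illustrative choices. It evaluates the dual functional at $\varphi = \log(d\beta/d\gamma)$, where the supremum in (2.2.2) is attained, and at another bounded function.

```python
import math

def relative_entropy(beta, gamma):
    """H(beta | gamma) on a finite space, as in (2.2.1); +inf if beta is not << gamma."""
    h = 0.0
    for b, g in zip(beta, gamma):
        if b > 0.0:
            if g == 0.0:
                return math.inf
            h += b * math.log(b / g)
    return h

def dual_functional(phi, beta, gamma):
    """The quantity  int phi d(beta) - log int e^phi d(gamma)  appearing in (2.2.2)."""
    mean_phi = sum(b * p for b, p in zip(beta, phi))
    log_mgf = math.log(sum(g * math.exp(p) for g, p in zip(gamma, phi)))
    return mean_phi - log_mgf

# Hypothetical three-point laws, chosen only for illustration.
gamma = [0.5, 0.3, 0.2]
beta = [0.2, 0.3, 0.5]

h = relative_entropy(beta, gamma)
phi_opt = [math.log(b / g) for b, g in zip(beta, gamma)]   # maximiser of (2.2.2)
phi_other = [1.0, 0.0, -1.0]                               # any other bounded function

h_dual_opt = dual_functional(phi_opt, beta, gamma)
h_dual_other = dual_functional(phi_other, beta, gamma)
```

Here `h_dual_opt` coincides with `h` up to rounding, while `h_dual_other` stays below it, in agreement with the equivalence of the two definitions.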

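The celebrated Sanov's theorem tells that for any measurable set $B$

$$-H(\mathring{B} \mid \alpha) \le \liminf_{n \to \infty} \frac{1}{n} \log \mathbb{P}(L_n \in B) \le \limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}(L_n \in B) \le -H(\bar{B} \mid \alpha) ,$$

where the interior $\mathring{B}$ and the closure $\bar{B}$ of $B$ are for the narrow topology.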
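As a sanity check of Sanov's theorem in the simplest setting, one may take fair coin flips and the set $B$ of laws on $\{0,1\}$ giving mass at least $q$ to $1$. The sketch below is an illustration under these assumptions (the values $p = 0.5$, $q = 0.7$ and the sample sizes are arbitrary): it computes $\frac{1}{n} \log \mathbb{P}(L_n \in B)$ exactly from the binomial tail and compares it with $-H(B \mid \alpha)$.

```python
import math

def log_binom_tail(n, p, k_min):
    """log P(Bin(n, p) >= k_min), via log-sum-exp over exact log-probabilities."""
    logs = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
        for k in range(k_min, n + 1)
    ]
    m = max(logs)
    return m + math.log(sum(math.exp(x - m) for x in logs))

def bernoulli_entropy(q, p):
    """Relative entropy of Bernoulli(q) with respect to Bernoulli(p)."""
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

p, q = 0.5, 0.7                      # illustrative parameters
rate = bernoulli_entropy(q, p)       # equals H(B | alpha) for B = {nu : nu({1}) >= q}

gaps = []
for n in (100, 1000, 10000):
    k_min = math.ceil(q * n - 1e-9)  # smallest number of ones with empirical mean >= q
    gaps.append(-log_binom_tail(n, p, k_min) / n - rate)
```

Each entry of `gaps` is the difference between $-\frac{1}{n} \log \mathbb{P}(L_n \in B)$ and the rate; it is positive (non asymptotic upper bound for a closed convex set) and shrinks as $n$ grows.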
Recall that one can reinforce the previous topology by considering the G-topology induced
by some subset $G$ of measurable functions containing all the bounded measurable functions.
In particular if $\alpha$ satisfies the strong Cramér assumption, i.e. $\forall g \in G$ , $\forall t > 0$ ,

(2.4) $\int e^{t|g|} \, d\alpha < +\infty$ ,

the previous result is still true for the G-topology (see [17] thm. 1.7). When $G$ is exactly the
set of measurable and bounded functions, the G-topology is usually called $\tau$-topology.

It is thus particularly interesting to have some information on the possible Arginf in (2.3).
The result below is collecting some known facts:

Theorem 2.5. Let $C$ be a measurable convex subset of $M_1(E)$ such that $H(C \mid \alpha) < +\infty$.
There exists a unique probability measure $\alpha^*$ such that any sequence $\nu_n$ of $C$ such that
$\lim_{n \to +\infty} H(\nu_n \mid \alpha) = H(C \mid \alpha)$ converges in total variation distance to $\alpha^*$.
This probability measure (we shall call the generalized I-projection of $\alpha$ on $C$) is characterized
by the following Pythagoras inequality: $H(\nu \mid \alpha) \ge H(\alpha^* \mid \alpha) + H(\nu \mid \alpha^*)$ for all $\nu \in C$.
If $\alpha^*$ belongs to $C$ we shall call it the I-projection (non generalized). In particular the
I-projection on a total variation closed convex subset $C$ such that $H(C \mid \alpha) < +\infty$ always exists.
Finally if $\alpha$ satisfies the strong Cramér assumption (2.4) one can replace total variation
closed by G-closed in the previous statement.



All these results can be found in [HI El (see chap. II for more details). 
Before stating our first results on thin constraints, we recall the known results on thick ones.
Theorem 2.6.

(2.6.1) (see [10]). If $C$ is convex, closed for the $\tau$-topology and satisfies
$H(C \mid \alpha) = H(\mathring{C} \mid \alpha) < +\infty$, then $\alpha^n_{C,k}$ defined in (2.1) is well defined for $n$ large enough,
and converges (when $n$ goes to $\infty$) in relative entropy to $\alpha^{*\otimes k}$, where $\alpha^*$ is the
I-projection of $\alpha$ on $C$.

(2.6.2) If $A$ is a measurable subset such that $H(\bar{A} \mid \alpha) = H(\mathring{A} \mid \alpha) < +\infty$, and
if there exists a unique $\alpha^* \in \bar{A}$ such that $H(\bar{A} \mid \alpha) = H(\alpha^* \mid \alpha)$, then $\alpha^n_{A,k}$ again
converges to $\alpha^{*\otimes k}$ but for the narrow convergence.

When $H(\mathring{A} \mid \alpha) = +\infty$ (in particular if $\mathring{A}$ is empty) but $H(A \mid \alpha) < +\infty$ (thin constraints)
we have to face some new problems. The strategy is then to enlarge $A$, considering some
nice $A_\varepsilon$, and to consider limits first in $n$, next in $\varepsilon$. Here we shall consider enlargements
depending on $n$. Here is a general result in this direction.

Theorem 2.7. Let $C_n$ be a non increasing sequence of convex subsets, closed for the
G-topology. Denote by $C = \bigcap_n C_n$. Assume that

(2.7.1) $H(C \mid \alpha) < +\infty$ ,

(2.7.2) $\alpha$ has an I-projection $\alpha^*$ on $C$ ,

(2.7.3) $\lim_{n \to \infty} H(C_n \mid \alpha) = H(C \mid \alpha)$ ,

(2.7.4) $\liminf_{n \to \infty} \frac{1}{n} \log \alpha^{\otimes n}(L_n \in C_n) \ge -H(C \mid \alpha)$ .

Then, for all $k \in \mathbb{N}^*$, $\alpha^n_{C_n,k}$ converges in total variation distance to $\alpha^{*\otimes k}$.
Remark 2.8. Consider the class of all measurable functions $g$ such that

$$\int e^{s|g|} \, d\alpha < +\infty \quad \text{for all } s \in \mathbb{R} .$$

If $G$ is contained in this class, we already know (see Theorem 2.5) that $\alpha^*$ exists as soon as $H(C \mid \alpha)$ is finite.
In addition, since the relative entropy is a good rate function (according to [17] its level sets
are compact) (2.7.3) is also satisfied. Hence, in this case, assuming $H(C \mid \alpha) < +\infty$, the
only remaining condition to check is

(2.9) $\liminf_{n \to \infty} \frac{1}{n} \log \alpha^{\otimes n}(L_n \in C_n) \ge -H(C \mid \alpha)$ .
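Proof of Theorem 2.7.

Let $\alpha_n^*$ be the generalized I-projection of $\alpha$ on $C_n$. Then

(2.10) $\| \alpha^n_{C_n,k} - \alpha^{*\otimes k} \|_{TV} \le \| \alpha^n_{C_n,k} - \alpha_n^{*\otimes k} \|_{TV} + \| \alpha_n^{*\otimes k} - \alpha^{*\otimes k} \|_{TV}$

$\qquad \le \sqrt{2 H(\alpha^n_{C_n,k} \mid \alpha_n^{*\otimes k})} + \sqrt{2 H(\alpha^{*\otimes k} \mid \alpha_n^{*\otimes k})}$

$\qquad \le \sqrt{2 H(\alpha^n_{C_n,k} \mid \alpha_n^{*\otimes k})} + \sqrt{2 k H(\alpha^* \mid \alpha_n^*)}$

where we have used successively the triangle inequality, Pinsker inequality and the additivity
of relative entropy. Since $\alpha^*$ is the I-projection of $\alpha$ on $C$, $\alpha^*$ belongs to $C$ and all $C_n$, so
that using Theorem 2.5

$$H(C \mid \alpha) = H(\alpha^* \mid \alpha) \ge H(\alpha^* \mid \alpha_n^*) + H(C_n \mid \alpha) .$$

Thanks to (2.7.3) we thus have $\lim_{n \to \infty} H(\alpha^* \mid \alpha_n^*) = 0$.

To finish the proof (according to (2.10)) it thus remains to show that
$\lim_{n \to \infty} H(\alpha^n_{C_n,k} \mid \alpha_n^{*\otimes k}) = 0$. But thanks to (2.7.4), for $n$ large enough, $\alpha^{\otimes n}(L_n \in C_n) > 0$, so that we may apply
Lemma 2.11 below with $A = C_n$. It yields

$$H(\alpha^n_{C_n,k} \mid \alpha_n^{*\otimes k}) \le -\frac{1}{[n/k]} \log \left( \alpha^{\otimes n}(L_n \in C_n) \, e^{n H(C_n \mid \alpha)} \right)$$

$$= -\frac{1}{[n/k]} \log \left( \alpha^{\otimes n}(L_n \in C_n) \, e^{n H(C \mid \alpha)} \right) + \frac{n}{[n/k]} \left( H(C \mid \alpha) - H(C_n \mid \alpha) \right) .$$

According to (2.7.4) the first term in the right hand side sum has a non positive limsup,
while the second term goes to 0 thanks to (2.7.3) (note that $n/[n/k] \to k$). Since the left hand side is nonnegative the
result follows. □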
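The two analytic ingredients of (2.10), Pinsker inequality and the additivity of relative entropy over product measures, are easy to check on a finite space. The sketch below is only an illustration: the laws `mu` and `nu` and the block size `k` are arbitrary choices, and total variation is normalised as the total mass of $|\mu - \nu|$, which is the normalisation under which Pinsker reads $\| \mu - \nu \|_{TV} \le \sqrt{2 H(\mu \mid \nu)}$.

```python
import math
from itertools import product

def relative_entropy(mu, nu):
    """H(mu | nu) on a finite space (nu is assumed to charge every point)."""
    return sum(m * math.log(m / v) for m, v in zip(mu, nu) if m > 0.0)

def total_variation(mu, nu):
    """Total mass of |mu - nu| (no factor 1/2)."""
    return sum(abs(m - v) for m, v in zip(mu, nu))

def tensor_power(mu, k):
    """Product law mu^{(x) k}, listed over all k-tuples of states."""
    states = range(len(mu))
    return [math.prod(mu[i] for i in idx) for idx in product(states, repeat=k)]

# Hypothetical laws on three points, for illustration only.
mu = [0.2, 0.5, 0.3]
nu = [0.4, 0.4, 0.2]
k = 3

h_one = relative_entropy(mu, nu)
h_k = relative_entropy(tensor_power(mu, k), tensor_power(nu, k))
tv_k = total_variation(tensor_power(mu, k), tensor_power(nu, k))
pinsker_bound = math.sqrt(2.0 * h_k)
```

Here `h_k` equals `k * h_one` up to rounding (additivity), and `tv_k` does not exceed `pinsker_bound`.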

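We now recall the key Lemma due to Csiszár ([10]) we have just used:

Lemma 2.11. Let $A$ be a convex G-closed subset, such that $H(A \mid \alpha) < +\infty$. Denote by $\alpha^*$
the generalized I-projection of $\alpha$ on $A$. Then if $\alpha^{\otimes n}(L_n \in A) > 0$, for all $k \in \mathbb{N}^*$ ,

$$H(\alpha^n_{A,k} \mid \alpha^{*\otimes k}) \le -\frac{1}{[n/k]} \log \left( \alpha^{\otimes n}(L_n \in A) \, e^{n H(A \mid \alpha)} \right) ,$$

where $[n/k]$ denotes the integer part of $n/k$.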

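Under some additional assumption one can improve the convergence in Theorem 2.7. Introduce
the usual Orlicz space

$$L_\tau(\alpha) = \left\{ g \text{ measurable} : \exists s > 0 \, , \, \int e^{s|g|} \, d\alpha < +\infty \right\} .$$

Note the difference with the class introduced in Remark 2.8 (for which $\exists$ is replaced by $\forall$). We equip $L_\tau(\alpha)$ with the Luxemburg
norm

$$\| g \|_\tau = \inf \left\{ s > 0 \, , \, \int \tau(g/s) \, d\alpha \le 1 \right\} \quad \text{where } \tau(u) = e^{|u|} - |u| - 1 .$$

It is well known that the dual space of $L_\tau(\alpha)$ contains the set of probability measures $\nu$ such
that $H(\nu \mid \alpha) < +\infty$. We equip this dual space with the dual norm $\| \cdot \|_\tau^*$.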
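On a finite probability space the Luxemburg norm can be computed by bisection, since $s \mapsto \int \tau(g/s) \, d\alpha$ is decreasing. The following sketch assumes that `weights` is a probability vector and uses arbitrary illustrative values for $g$; the helper names are hypothetical and not taken from the paper.

```python
import math

def tau(u):
    """Young function tau(u) = e^{|u|} - |u| - 1."""
    a = abs(u)
    return math.exp(a) - a - 1.0

def orlicz_integral(values, weights, s):
    """int tau(g / s) d(alpha) for a function g on a finite probability space."""
    return sum(w * tau(g / s) for g, w in zip(values, weights))

def luxemburg_norm(values, weights, iters=200):
    """inf{s > 0 : int tau(g/s) d(alpha) <= 1}, by bisection on s."""
    top = max(abs(g) for g in values)
    if top == 0.0:
        return 0.0
    hi = top                         # feasible: tau(u) <= e - 2 < 1 whenever |u| <= 1
    lo = hi
    while orlicz_integral(values, weights, lo) <= 1.0:
        lo *= 0.5                    # shrink until the constraint is violated
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if orlicz_integral(values, weights, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

# Illustrative function g and law alpha on three points.
g_values = [0.0, 1.0, 3.0]
alpha = [0.5, 0.3, 0.2]

norm_g = luxemburg_norm(g_values, alpha)
norm_2g = luxemburg_norm([2.0 * g for g in g_values], alpha)
```

The returned value saturates the constraint, and `norm_2g` is twice `norm_g`, as expected from the homogeneity of a norm.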

Proposition 2.12. In addition to all the assumptions in Theorem 2.7 assume the following:
the densities $h_n = \frac{d\alpha_n^*}{d\alpha}$ ($\alpha_n^*$ being the generalized I-projection of $\alpha$ on $C_n$) define a bounded
sequence in $L^p(\alpha)$ for some $p > 1$. Then

$$\lim_{n \to \infty} \| \alpha^n_{C_n,1} - \alpha^* \|_\tau^* = 0 .$$

Proof. The proof is exactly the same as the one of Theorem 2.7 with $k = 1$, just replacing
$\| \cdot \|_{TV}$ by $\| \cdot \|_\tau^*$ in the first line of (2.10), and then replacing Pinsker inequality by the following
one, available for $\nu_i$'s such that $H(\nu_i \mid \alpha) < +\infty$,

(2.13) $\| \nu_1 - \nu_2 \|_\tau^* \le q C \left( 1 + \log\left( 4^{1/q} \left\| \frac{d\nu_2}{d\alpha} \right\|_p \right) \right) \left[ H(\nu_1 \mid \nu_2) + \sqrt{H(\nu_1 \mid \nu_2)} \right]$ ,

where $q = p/(p-1)$, $\nu_2$ being $\alpha_n^*$ and $\nu_1$ being either $\alpha^n_{C_n,1}$ or $\alpha^*$.

In order to prove (2.13) we first recall the weighted Pinsker inequality recently shown by
Bolley and Villani jSj (also see for another approach): there exists some $C$ such that for
all nonnegative $f$ and all $\delta > 0$ ,

$$\| f \nu_1 - f \nu_2 \|_{TV} \le (C/\delta) \left( 1 + \log \int e^{\delta f} \, d\nu_2 \right) \left( H(\nu_1 \mid \nu_2) + \sqrt{H(\nu_1 \mid \nu_2)} \right) .$$

For an $f$ such that $\| f \|_\tau \le 1$ and $\delta = 1/q$ it thus holds, first $\int e^{f} \, d\alpha \le 4$, then thanks to
Hölder's inequality $\int e^{\delta f} \, d\nu_2 \le 4^{1/q} \left\| \frac{d\nu_2}{d\alpha} \right\|_p$. (2.13) immediately follows. □



3. F moment constraints. 

In this section $G = L_\tau(\alpha)$ and we consider constraints $C$ in the form

$$C = \left\{ \nu \in M_1(E) \, , \, \int F \, d\nu \in K \right\}$$

where $F$ is a measurable $B$ valued map ($(B, \| \cdot \|)$ being a separable Banach space equipped
with its cylindrical $\sigma$-field), where $\int F \, d\nu$ denotes the Bochner integral and $K$ is a closed
convex set of $B$. We denote by

$$\forall \lambda \in B' , \quad Z_F(\lambda) = \int_E \exp(\langle \lambda, F(x) \rangle) \, \alpha(dx) \; , \quad \Lambda_F(\lambda) = \log Z_F(\lambda)$$

the Laplace transform and moment generating function of $F$.
We always assume that

• $\| F \| \in L_\tau(\alpha)$,

• $\mathrm{dom} \Lambda_F = \{ \lambda \in B' \, , \, \Lambda_F(\lambda) < +\infty \}$ ($B'$ being the dual space of $B$) is a non empty
open set of $B'$.

The enlargement $C_n$ is defined similarly

$$C_n = \left\{ \nu \in M_1(E) \, , \, \int F \, d\nu \in K^{\varepsilon_n} \right\}$$

for $K^{\varepsilon_n} = \{ x \in B \, , \, d(x, K) \le \varepsilon_n \}$.

What we have to do is to check all the assumptions of Theorem 2.7. But the situation here
is particular since the condition $L_n \in C$ reduces to $\frac{1}{n} \sum_{i=1}^n F(X_i) \in K$. Thanks to the next
Lemma 3.1 assumption (2.7.4) reduces to well known estimates:

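Lemma 3.1. Assume that the I-projection $\alpha^*$ of $\alpha$ on $C$ exists and can be written
$\alpha^* = \frac{e^{\langle \lambda^*, F \rangle}}{Z_F(\lambda^*)} \, \alpha$ for some $\lambda^* \in B'$. Then for all $\varepsilon > 0$,

$$\frac{1}{n} \log \left( \alpha^{\otimes n}(L_n \in C_\varepsilon) \, e^{n H(C \mid \alpha)} \right) \ge \frac{1}{n} \log \mathbb{P}\left( \left\| \frac{1}{n} \sum_{i=1}^n F(Y_i) - \int F \, d\alpha^* \right\| \le \varepsilon \right) - \| \lambda^* \| \, \varepsilon ,$$

where the $Y_i$'s are i.i.d. random variables with common law $\alpha^*$ (here $C_\varepsilon$ is defined as $C_n$, with $\varepsilon$ in place of $\varepsilon_n$).

Proof. The proof uses the standard centering method in large deviations theory. Denote by
$L_n^x = \frac{1}{n} \sum_{i=1}^n \delta_{x_i}$ the empirical measure of $x = (x_1, \ldots, x_n)$. Then

$$\alpha^{\otimes n}(L_n \in C_\varepsilon) = e^{-n H(\alpha^* \mid \alpha)} \int 1_{C_\varepsilon}(L_n^x) \exp\left( -n \left\langle L_n^x - \alpha^* , \log \frac{d\alpha^*}{d\alpha} \right\rangle \right) d\alpha^{*\otimes n}(x)$$

$$= e^{-n H(\alpha^* \mid \alpha)} \int 1_{C_\varepsilon}(L_n^x) \exp\left( -n \left\langle \lambda^* , \frac{1}{n} \sum_{i=1}^n F(x_i) - \int F \, d\alpha^* \right\rangle \right) d\alpha^{*\otimes n}(x) .$$

Now we may replace $C_\varepsilon$ by its subset

$$\tilde{C}_\varepsilon = \left\{ \nu \in M_1(E) \, , \, \int \| F \| \, d\nu < +\infty \text{ and } \left\| \int F \, d\nu - \int F \, d\alpha^* \right\| \le \varepsilon \right\}$$

and obtain

$$\alpha^{\otimes n}(L_n \in C_\varepsilon) \, e^{n H(\alpha^* \mid \alpha)} \ge e^{-n \| \lambda^* \| \varepsilon} \, \alpha^{*\otimes n}\left( \left\| \frac{1}{n} \sum_{i=1}^n F(x_i) - \int F \, d\alpha^* \right\| \le \varepsilon \right)$$

that completes the proof, since $H(C \mid \alpha) = H(\alpha^* \mid \alpha)$. □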



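The next Lemma 3.2 is well known in convex analysis. For a complete proof the reader is
referred to [19] Lemma II.39.

Lemma 3.2. Under our hypotheses on $F$ and $\mathrm{dom} \Lambda_F$, if in addition the function

$$H(\lambda) = \Lambda_F(\lambda) - \inf_{y \in K} \langle \lambda, y \rangle$$

achieves its minimum at (at least one) $\lambda^*$, then $H(C \mid \alpha) = \sup_{\lambda \in B'} \{ \inf_{y \in K} \langle \lambda, y \rangle - \Lambda_F(\lambda) \}$
and the I-projection $\alpha^*$ of $\alpha$ on $C$ exists and can be written $\alpha^* = \frac{e^{\langle \lambda^*, F \rangle}}{Z_F(\lambda^*)} \, \alpha$.

In the sequel we shall denote (H-K) the additional assumption on $H$. In particular if
$K = \{ x_0 \}$ with $x_0 = \nabla \Lambda_F(\lambda_0)$, (H-K) is satisfied.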
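For a concrete feeling of Lemma 3.2, one can take $E = \{1, \ldots, 6\}$, $\alpha$ the law of a fair die, $F(x) = x$ and $K = \{x_0\}$. The sketch below makes these illustrative assumptions (they are not an example from the paper): it solves $\nabla \Lambda_F(\lambda^*) = x_0$ by bisection, builds the exponentially tilted law, and compares $H(\alpha^* \mid \alpha)$ with the dual value $\lambda^* x_0 - \Lambda_F(\lambda^*)$. It also evaluates the Pythagoras relation of Theorem 2.5 on another element of $C$.

```python
import math

faces = [1, 2, 3, 4, 5, 6]
alpha = [1.0 / 6.0] * 6              # reference law: fair die (illustrative choice)

def relative_entropy(nu, mu):
    return sum(p * math.log(p / m) for p, m in zip(nu, mu) if p > 0.0)

def log_mgf(lam):
    """Lambda_F(lam) = log int exp(lam * F) d(alpha), with F(x) = x."""
    return math.log(sum(a * math.exp(lam * x) for a, x in zip(alpha, faces)))

def tilted(lam):
    """The law exp(lam * F) / Z_F(lam) * alpha."""
    w = [a * math.exp(lam * x) for a, x in zip(alpha, faces)]
    z = sum(w)
    return [v / z for v in w]

def mean(nu):
    return sum(p * x for p, x in zip(nu, faces))

def solve_tilt(x0, lo=-20.0, hi=20.0, iters=200):
    """Bisection on lam -> mean(tilted(lam)), which is increasing in lam."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(tilted(mid)) < x0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = 4.5
lam_star = solve_tilt(x0)
alpha_star = tilted(lam_star)

entropy_primal = relative_entropy(alpha_star, alpha)
entropy_dual = lam_star * x0 - log_mgf(lam_star)

# Another law with mean 4.5, i.e. another element of the constraint set C.
nu = [0.0, 0.0, 0.0, 0.5, 0.5, 0.0]
pythagoras_gap = (relative_entropy(nu, alpha)
                  - entropy_primal - relative_entropy(nu, alpha_star))
```

The primal and dual values agree, and `pythagoras_gap` is nonnegative up to rounding; for a constraint set defined by a linear equality it is in fact zero.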

Before stating our first general result let us recall some definition.

Definition 3.3. $B$ is of type 2 if there exists some $a > 0$ such that for all sequence $Z_i$ of
$B$ valued i.i.d. random variables with zero mean and variance equal to 1, the following holds

$$\mathbb{E}\left( \left\| \sum_{i=1}^n Z_i \right\|^2 \right) \le a \sum_{i=1}^n \mathbb{E}\left( \| Z_i \|^2 \right) .$$

In particular a Hilbert space is of type 2.
We arrive at

Theorem 3.4. In addition to our hypotheses on $F$ and $\mathrm{dom} \Lambda_F$, assume that $B$ is of type
2 and that (H-K) is satisfied. If $\varepsilon_n > c/\sqrt{n}$ with $c = \sqrt{a \, \mathrm{Var}_{\alpha^*}(F)}$, then $\alpha^n_{C_n,k}$ converges to
$\alpha^{*\otimes k}$ in total variation distance when $n \to \infty$.

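Proof. (2.7.1) and (2.7.2) are satisfied with our hypotheses, according to Lemma 3.2.

In order to prove (2.7.3), introduce the function $H_n$ similar to $H$ in Lemma 3.2, replacing $K$
by $K^{\varepsilon_n}$. Of course

$$\inf H \le \inf H_n \le H_n(\lambda^*) \to \inf H = H(\lambda^*)$$

since $H_n$ converges to $H$ pointwise on the domain of $H$. We already know that
$\inf H = -H(C \mid \alpha)$. It is thus enough to prove that $\inf H_n = -H(C_n \mid \alpha)$. But this is a consequence
of Csiszár results ([11] thm 3.3 and [10] thm 2 and 3, also see [19] thm II.41 for another proof)
since the intersection of the interior of $K^{\varepsilon_n}$ and the convex hull of the support of the image
measure $\alpha \circ F^{-1}$ is non empty.

Finally in order to prove (2.7.4), according to Lemma 3.1 it is enough to check that

$$\lim_{n \to \infty} \frac{1}{n} \log \mathbb{P}\left( \left\| \frac{1}{n} \sum_{i=1}^n F(Y_i) - \int F \, d\alpha^* \right\| \le \varepsilon_n \right) = 0 .$$

To this end recall the following theorem of Yurinskii.

Theorem 3.5. (Yurinskii, [23] theorem 2.1). If $Z_i$ is a $B$ valued sequence of centered
independent variables such that there exist $b$ and $M$ both positive, with

$$\forall i \in \mathbb{N}^* , \forall k \ge 2 , \quad \mathbb{E}\left( \| Z_i \|^k \right) \le \frac{k!}{2} \, b^2 M^{k-2} ,$$

then denoting $S_n = \sum_{i=1}^n Z_i$, it holds

$$\forall t > 0 , \quad \mathbb{P}\left( \| S_n \| \ge \mathbb{E}(\| S_n \|) + n t \right) \le \exp\left( -\frac{1}{8} \, \frac{n t^2}{b^2 + t M} \right) .$$

We may apply Theorem 3.5 with $Z_i = F(Y_i) - \int F \, d\alpha^*$, $M = \| F - \int F \, d\alpha^* \|_{L_\tau(\alpha^*)}$ and $b = \sqrt{2} M$
as soon as $F \in L_\tau(\alpha^*)$. Indeed since $B$ is of type 2,
$\mathbb{E}(\| S_n \|) \le \sqrt{\mathbb{E}(\| S_n \|^2)} \le \sqrt{a n} \, \sigma$ with $\sigma = \sqrt{\mathbb{E}(\| Z_i \|^2)}$. It follows

$$\mathbb{P}\left( \left\| \frac{1}{n} \sum_{i=1}^n F(Y_i) - \int F \, d\alpha^* \right\| \le \frac{\sigma \sqrt{a}}{\sqrt{n}} (1 + t) \right) \ge 1 - \exp\left( -\frac{1}{8} \, \frac{a \sigma^2 t^2}{2 M^2 + t M} \right)$$

and the result provided $\varepsilon_n \sqrt{n} > \sigma \sqrt{a}$.

It remains to prove that $F \in L_\tau(\alpha^*)$. But thanks to the representation of $\alpha^*$ obtained in
Lemma 3.2, with $q = p/(p-1)$,

$$\int e^{t \| F \|} \, d\alpha^* = \frac{1}{Z_F(\lambda^*)} \int e^{t \| F \|} e^{\langle \lambda^*, F \rangle} \, d\alpha \le \frac{1}{Z_F(\lambda^*)} \left( \int e^{q t \| F \|} \, d\alpha \right)^{1/q} \left( \int e^{p \langle \lambda^*, F \rangle} \, d\alpha \right)^{1/p} .$$

Since $\mathrm{dom} \Lambda_F$ is a non empty open set containing $\lambda^*$, there exists some $p > 1$ such that
$p \lambda^* \in \mathrm{dom} \Lambda_F$, and the result follows for $t$ small enough since $F \in L_\tau(\alpha)$. □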
Remark 3.6. Note that if for instance $F$ is bounded everything in Theorem 3.4 can be
explicitly described with the only parameter $n$. However (unfortunately) we do not know
any explicit bound for the speed of convergence of $\alpha^n_{C_n,k}$, because we do not know in general
how to evaluate $H(C_n \mid \alpha) - H(C \mid \alpha)$. Hence from a practical point of view, if we know
how to enlarge $C$, we do not know when a possible algorithm has to be stopped.

It is natural to ask whether $\varepsilon_n \sim 1/\sqrt{n}$ is the optimal order for the enlargement or not. In
one dimension the answer is negative as we shall see below.

Theorem 3.7. If $B = \mathbb{R}$ the conclusion of Theorem 3.4 remains true for $\varepsilon_n \ge c/n$ for some
$c$ large enough.



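Proof. We shall just replace Yurinskii's estimate by the Berry-Esseen bound. Indeed the
Berry-Esseen theorem tells us that

$$\mathbb{P}\left( \left| \frac{1}{n} \sum_{i=1}^n F(Y_i) - \int F \, d\alpha^* \right| \le \varepsilon_n \right) \ge \Phi\left( \frac{\varepsilon_n \sqrt{n}}{\sigma} \right) - \Phi\left( -\frac{\varepsilon_n \sqrt{n}}{\sigma} \right) - \frac{20 \, \kappa}{\sigma^3 \sqrt{n}} ,$$

where $\Phi(u) = \int_{-\infty}^u e^{-s^2/2} \, ds / \sqrt{2\pi}$, $\sigma^2 = \mathrm{Var}_{\alpha^*} F$ and $\kappa$ is the $\alpha^*$'s moment of order 3 of
$F - \int F \, d\alpha^*$. It easily follows that

$$\mathbb{P}\left( \left| \frac{1}{n} \sum_{i=1}^n F(Y_i) - \int F \, d\alpha^* \right| \le \varepsilon_n \right) \ge \frac{2}{\sqrt{n}} \left( \frac{n \varepsilon_n}{\sigma \sqrt{2\pi}} \, e^{-n \varepsilon_n^2 / (2 \sigma^2)} - 10 \, (\kappa / \sigma^3) \right) = \theta_n .$$

The requested $\frac{1}{n} \log \theta_n \to 0$ follows with $\varepsilon_n = c/n$ provided $c > 10 \sqrt{2\pi} \, (\kappa / \sigma^2)$.

□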
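The one dimensional statement can be explored by an exact computation on a toy model. The sketch below assumes a fair die as reference law, $F(x) = x$, $K = \{4.5\}$ and an enlargement of half-width $c/n$ with $c = 1$; these are illustrative choices and the constant is not the one required by Theorem 3.7. It computes the conditional law of $X_1$ given $|M_n - x_0| \le c/n$ by convolution and measures its total variation distance to the tilted law $\alpha^*$.

```python
import math

faces = [1, 2, 3, 4, 5, 6]
alpha = [1.0 / 6.0] * 6              # reference law: fair die (illustrative choice)

def tilted(lam):
    """Exponentially tilted law, proportional to exp(lam * x) * alpha(x)."""
    w = [a * math.exp(lam * x) for a, x in zip(alpha, faces)]
    z = sum(w)
    return [v / z for v in w]

def solve_tilt(x0, lo=-20.0, hi=20.0, iters=200):
    """Bisection on lam -> mean of tilted(lam), which is increasing in lam."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(p * x for p, x in zip(tilted(mid), faces)) < x0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def law_of_sum(n):
    """dist[s] = P(X_1 + ... + X_n = s) under the product law, by convolution."""
    dist = [1.0]
    for _ in range(n):
        new = [0.0] * (len(dist) + 6)
        for s, p in enumerate(dist):
            if p > 0.0:
                for x, a in zip(faces, alpha):
                    new[s + x] += p * a
        dist = new
    return dist

def window_mass(dist, lo, hi):
    """P(lo <= S <= hi) for an integer valued sum with law dist."""
    first = max(0, math.ceil(lo - 1e-9))
    last = min(len(dist) - 1, math.floor(hi + 1e-9))
    return sum(dist[first:last + 1]) if first <= last else 0.0

def conditional_law_of_x1(n, x0, c):
    """Law of X_1 given |M_n - x0| <= c / n, i.e. |S_n - n x0| <= c."""
    lo, hi = n * x0 - c, n * x0 + c
    denom = window_mass(law_of_sum(n), lo, hi)
    if denom == 0.0:
        return None                  # the conditioning event has probability zero
    rest = law_of_sum(n - 1)
    return [a * window_mass(rest, lo - x, hi - x) / denom
            for x, a in zip(faces, alpha)]

x0, c = 4.5, 1.0
alpha_star = tilted(solve_tilt(x0))

tv = []
for n in (20, 80, 320):
    cond = conditional_law_of_x1(n, x0, c)
    tv.append(sum(abs(p - q) for p, q in zip(cond, alpha_star)))

# With n odd and a half-width d < 1/2, no value of the sum is admissible.
empty_case = conditional_law_of_x1(31, x0, 0.25)
```

The list `tv` decreases with $n$, while `empty_case` shows that an enlargement of size $d/n$ with $d$ too small can make the conditioning event empty.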



Again one may ask about optimality. Actually it is not difficult to build examples with
$\varepsilon_n = d/n$ for some small $d$ such that $\mathbb{P}(L_n \in C_n) = 0$ for all $n$. In a sense this is some proof
of optimality. But we do not know how to build examples such that the previous probability
is not zero.

Finally we may improve the convergence, still in the finite dimensional case, under a slightly
more restrictive assumption.

Theorem 3.8. In Theorem 3.4 assume that $B$ is finite dimensional and replace the hypothesis (H-K) by
the following: $\mathring{K} \cap S \ne \emptyset$, where $S$ is the convex hull of the support of the image measure
$\alpha \circ F^{-1}$. Then $\alpha^n_{C_n,k}$ converges to $\alpha^{*\otimes k}$ both for the dual norm $\| \cdot \|_\tau^*$ and in relative entropy.



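Proof. The first point is that the new hypothesis is stronger than (H-K). Indeed it is known
(see e.g. [12] or [19] Lemma III.65 for complete proofs) that not only (H-K) holds (as well
as (H-$K^{\varepsilon_n}$) of course), but the minimizers $\lambda^*$ and $\lambda_n^*$ are unique and $\lambda_n^* \to \lambda^*$ as $n \to \infty$.
Hence $H(C_n \mid \alpha) \to H(C \mid \alpha)$ too.

Next $\int \left( \frac{d\alpha_n^*}{d\alpha} \right)^p d\alpha = \frac{Z_F(p \lambda_n^*)}{(Z_F(\lambda_n^*))^p}$. Since $\lambda_n^*$ is a bounded (convergent) sequence, the above
quantity can be easily bounded for some $p > 1$ (using again the fact that $\mathrm{dom} \Lambda_F$ is an open
set). Convergence for the dual norm $\| \cdot \|_\tau^*$ follows from Proposition 2.12.
Finally using exchangeability we have

$$H(\alpha^n_{C_n,k} \mid \alpha^{*\otimes k}) = H(\alpha^n_{C_n,k} \mid \alpha_n^{*\otimes k}) - k H(\alpha^* \mid \alpha_n^*) + k \int \log \frac{d\alpha_n^*}{d\alpha^*} \, (d\alpha^n_{C_n,1} - d\alpha^*) .$$

We already saw in the proof of Theorem 2.7 that $H(\alpha^* \mid \alpha_n^*)$ and $H(\alpha^n_{C_n,k} \mid \alpha_n^{*\otimes k})$ go to
0. It remains to prove that $\int \log \frac{d\alpha_n^*}{d\alpha^*} \, (d\alpha^n_{C_n,1} - d\alpha^*)$ goes to 0. But, up to an additive constant,
$\log \frac{d\alpha_n^*}{d\alpha^*} = \langle \lambda_n^* - \lambda^* , F \rangle$ is
bounded in $L_\tau(\alpha)$ for $n$ large enough since $\lambda_n^*$ goes to $\lambda^*$. Hence convergence to 0 of this last
term follows from the convergence for the dual norm $\| \cdot \|_\tau^*$ we have just shown. □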

Remark 3.9. In Theorem 3.8 one can also replace Yurinskii's bound by the classical Bernstein
inequality (see e.g. [T^l). This only improves the constants (see [19] for the details).

The results of this Section are satisfactory mainly thanks to Lemma 3.1 and the very complete
literature on sums of independent variables. The situation is of course more intricate in more
delicate situations. We shall study some of them in the next sections.

4. General convex constraints. 

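We start with the key minimization bound we shall use. The following result is stated in [14]
Exercise 3.3.23 p. 76. A complete proof is contained in [20] (also see PiHl).

Proposition 4.1. Let $A \subset M_1(E)$ be such that $\{ x \, , \, L_n^x \in A \}$ is measurable. If $\nu$ is such that
$\nu \ll \alpha$ and $\nu^{\otimes n}(L_n \in A) > 0$, then

$$\frac{1}{n} \log \left( \alpha^{\otimes n}(L_n \in A) \, e^{n H(\nu \mid \alpha)} \right) \ge -H(\nu \mid \alpha) \, \frac{\nu^{\otimes n}(L_n \in A^c)}{\nu^{\otimes n}(L_n \in A)} + \frac{1}{n} \log \nu^{\otimes n}(L_n \in A) - \frac{1}{n \, e \, \nu^{\otimes n}(L_n \in A)} .$$

Corollary 4.2. If (2.7.1, 2, 3) are all satisfied, (2.7.4) holds as soon as

$$\lim_{n \to \infty} \alpha^{*\otimes n}(L_n \in C_n) = 1 .$$

The proof is an immediate application of Proposition 4.1 with $A = C_n$ and $\nu = \alpha^*$ since
$H(C \mid \alpha) = H(\alpha^* \mid \alpha)$.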

In the remainder of the section we shall assume that $G = C_b(E)$. According to Remark
2.8 it is thus enough to check (2.7.1 and 4) in order to apply Theorem 2.7. In particular if
$H(C \mid \alpha)$ is finite, it just remains to check the condition stated in Corollary 4.2 by choosing
appropriate enlargements $C_n$. To this end we first recall basic facts on metrics on probability
measures.

Recall that the narrow topology on $M_1(E)$ is metrizable. Among admissible metrics we shall
consider two, namely the Prohorov metric $d_P$ and the Fortet-Mourier metric $d_{FM}$.

Proposition 4.3. For two probability measures $\nu_1$ and $\nu_2$ on $E$ the previous metrics are
defined as follows

$$d_P(\nu_1, \nu_2) = \inf \left\{ a > 0 : \sup_A \left( \nu_1(A) - \nu_2(A^a) \right) \le a \right\} \quad \text{where } A^a = \{ x : d(x, A) \le a \} ,$$

$$d_{FM}(\nu_1, \nu_2) = \sup \left\{ \int f \, d\nu_1 - \int f \, d\nu_2 \; \text{ for } f \in BLip(E) \text{ such that } \| f \|_{BLip} \le 1 \right\} ,$$

where $BLip$ is the set of bounded and Lipschitz functions and $\| f \|_{BLip} = \| f \|_\infty + \| f \|_{Lip}$.
For both metrics $M_1(E)$ is Polish. If in addition $E$ is compact then so is $M_1(E)$.
Furthermore the following inequalities are known to hold

$$d_{FM}(\nu_1, \nu_2) \le \| \nu_1 - \nu_2 \|_{TV} \quad \text{and} \quad d_P(\nu_1, \nu_2) \le \frac{1}{2} \| \nu_1 - \nu_2 \|_{TV} ,$$

and

$$\varphi(d_P(\nu_1, \nu_2)) \le d_{FM}(\nu_1, \nu_2) \le 2 \, d_P(\nu_1, \nu_2) ,$$

where (p{u) = 
In the sequel 

$$C_n = C^{\varepsilon_n} = \{ \nu : d(C, \nu) \le \varepsilon_n \}$$
where $d$ is one of the previous metrics.

Definition 4.4. Let $(X, d)$ be a metric space. If $A \subset X$ is totally bounded, we denote by
$N_X(A, d, \varepsilon)$ the minimal number of (open) balls with radius $\varepsilon$ that covers $A$. The function
$N_X$ is often called the metric entropy. In the sequel we simply note $N(d, \varepsilon)$ the quantity
$N_X(X, d, \varepsilon)$, if $X$ is totally bounded.

Our first result is concerned with compact state spaces. 

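Theorem 4.5. Assume that $E$ is compact. Let $C$ be a narrowly closed convex subset of
$M_1(E)$ such that $H(C \mid \alpha) < +\infty$, and $\alpha^*$ be the I-projection of $\alpha$ on $C$. Then for any
sequence $\varepsilon_n$ going to 0 and such that $N(d_{FM}, \varepsilon_n/4) \, e^{-n \varepsilon_n^2/8} \to 0$ (resp. $N(d_P, \varepsilon_n/4) \, e^{-n \varepsilon_n^2/2} \to 0$)
as $n \to \infty$, $\alpha^n_{C_n,k} \to \alpha^{*\otimes k}$ in total variation distance.

Proof. Let $B(\alpha^*, \varepsilon)$ be the open ball centered at $\alpha^*$ with radius $\varepsilon$. Then

$$\alpha^{*\otimes n}(L_n \in C^\varepsilon) \ge \alpha^{*\otimes n}(L_n \in B(\alpha^*, \varepsilon)) = 1 - \alpha^{*\otimes n}(L_n \in B^c(\alpha^*, \varepsilon))$$

where $B^c$ is as usual the complement subset of $B$. But we can cover $B^c(\alpha^*, \varepsilon)$ by

$$N_{M_1(E)}(B^c(\alpha^*, \varepsilon), d, \eta) \le N(d, \eta)$$

closed balls with radius $\eta$ so that

$$\alpha^{*\otimes n}(L_n \in B^c(\alpha^*, \varepsilon)) \le N(d, \eta) \, \max_j \alpha^{*\otimes n}(L_n \in B_j)$$

for such balls $B_j$. But a closed ball being closed and convex, Lemma 2.11 shows that

$$\alpha^{*\otimes n}(L_n \in B_j) \le e^{-n H(B_j \mid \alpha^*)} .$$

Since $B_j \subset (B^c(\alpha^*, \varepsilon))^{2\eta}$ we have $H(B_j \mid \alpha^*) \ge H((B^c(\alpha^*, \varepsilon))^{2\eta} \mid \alpha^*)$ and finally

$$\alpha^{*\otimes n}(L_n \in B^c(\alpha^*, \varepsilon)) \le N(d, \eta) \, e^{-n H((B^c(\alpha^*, \varepsilon))^{2\eta} \mid \alpha^*)} .$$

Choosing $\eta = \varepsilon/4$, hence $(B^c(\alpha^*, \varepsilon))^{2\eta} = B^c(\alpha^*, \varepsilon/2)$, we may apply the results recalled in
Proposition 4.3 to get that for all $\nu \in B^c(\alpha^*, \varepsilon/2)$,

$$H(\nu \mid \alpha^*) \ge \frac{1}{2} \| \nu - \alpha^* \|_{TV}^2 \ge \frac{1}{2} \, d_{FM}^2(\nu, \alpha^*) \ge \frac{\varepsilon^2}{8} .$$

We can replace 8 by 2 when replacing the Fortet-Mourier metric by the Prohorov one.
Hence we may apply Corollary 4.2 and Theorem 2.7. □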
The condition on $\varepsilon_n$ in the previous Theorem is interesting if it can be satisfied by at least
one such sequence. The following proposition shows that it is always the case; it also relates
the metric entropy on $M_1(E)$ to the metric entropy on $E$.

Proposition 4.6. Let $(E, d)$ be a compact metric space. Then for all $\varepsilon > 0$,

(4.6.1) $N(d_P, \varepsilon) \le \left( \frac{2e}{\varepsilon} \right)^{N(d, \varepsilon)}$ ,

(4.6.2) $N(d_{FM}, \varepsilon) \le \left( \frac{4e}{\varepsilon} \right)^{N(d, \varepsilon/2)}$ ,

(4.6.3) there exists at least one sequence $\varepsilon_n$ going to 0 and such that

$$\lim_{n \to \infty} \left( \frac{n \varepsilon_n^2}{8} + (\log \varepsilon_n) \, N(d, \varepsilon_n/8) \right) = +\infty .$$

Such a sequence fulfills the condition in Theorem 4.5 for both metrics on $M_1(E)$ (but
is not sharp).

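Proof. The first result is due to Kulkarni-Zeitouni (j2j Lemma 1), the second one follows
thanks to Proposition 4.3.
Consider

$$f : ]0, 1] \to \mathbb{R}^+ \; , \quad \varepsilon \mapsto -\frac{8 (\log \varepsilon) \, N(d, \varepsilon/8)}{\varepsilon^2}$$

which is clearly decreasing with infinite limit at 0. Let $u_n$ be a $]0, 1]$ valued non increasing
sequence going to 0; $w_n = f(u_n)$ is then non decreasing with infinite limit. Introduce for $n$ large enough
$k_n = \max \{ k \in \mathbb{N}^* \, , \, \text{s.t. } w_k \le \sqrt{n} \}$.

• If for all $n$ large enough, $k_n < n$, we choose $\varepsilon_n = u_{k_n}$ for all $n \in [k_n, k_{n+p_n}[$ where
$p_n = \inf \{ p \ge 1 \, , \, k_{n+p} > k_n \}$. On one hand $n \varepsilon_n^2 \ge k_n u_{k_n}^2$ goes to infinity. On the
other hand,

$$n \varepsilon_n^2 + 8 (\log \varepsilon_n) N(d, \varepsilon_n/8) = n \varepsilon_n^2 \left( 1 - \frac{w_{k_n}}{n} \right) \ge n \varepsilon_n^2 \left( 1 - \frac{1}{\sqrt{n}} \right) \to +\infty .$$

• If not, there exists some sequence $p_j$ growing to infinity such that $k_{p_j} \ge p_j$, i.e.
$w_{p_j} \le \sqrt{p_j}$. Define $\varphi(n)$ as the unique integer number such that $n \in [p_{\varphi(n)}, p_{\varphi(n)+1}[$
and choose $\varepsilon_n = u_{p_{\varphi(n)}}$. Then $n \varepsilon_n^2 \ge p_{\varphi(n)} u_{p_{\varphi(n)}}^2$ goes to infinity and

$$n \varepsilon_n^2 + 8 (\log \varepsilon_n) N(d, \varepsilon_n/8) = n \varepsilon_n^2 \left( 1 - \frac{w_{p_{\varphi(n)}}}{n} \right) \ge n \varepsilon_n^2 \left( 1 - \frac{1}{\sqrt{p_{\varphi(n)}}} \right) \to +\infty .$$

The final statement is a consequence of the previous ones. The proof is thus completed. □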

Example 4.7. If $E$ is a $q$ dimensional compact Riemannian manifold, it is known that
$N(d, \varepsilon) \le C_E / \varepsilon^q$ for some constant $C_E$. In this case we may thus choose $\varepsilon_n = 1/n^a$ for
all $0 < a < \frac{1}{q+2}$. The size of enlargement is thus much greater than for F-moment
constraints.
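The admissible range of exponents can be read off the criterion $n \varepsilon_n^2 + 8 (\log \varepsilon_n) N(d, \varepsilon_n/8) \to +\infty$ by a direct evaluation. The sketch below assumes the covering bound is attained, $N(d, r) = C_E / r^q$, with the hypothetical values $C_E = 1$ and $q = 1$, and compares an exponent below $1/(q+2)$ with one above it.

```python
import math

def enlargement_criterion(n, a, q, c_e=1.0):
    """n eps^2 + 8 log(eps) N(d, eps/8), with eps = n^{-a} and N(d, r) = c_e / r^q."""
    eps = n ** (-a)
    covering = c_e * (8.0 / eps) ** q
    return n * eps ** 2 + 8.0 * math.log(eps) * covering

q = 1                                   # illustrative dimension, threshold 1/(q+2) = 1/3
sizes = [10 ** 6, 10 ** 9, 10 ** 12]

admissible = [enlargement_criterion(n, 0.2, q) for n in sizes]   # a < 1/(q+2)
too_fast = [enlargement_criterion(n, 0.5, q) for n in sizes]     # a > 1/(q+2)
```

For $a = 0.2$ the criterion is positive and grows with $n$, whereas for $a = 0.5$ the entropy term dominates and the criterion goes to $-\infty$.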



When $E$ is no longer compact, but still Polish, it can be approximated by compact subsets
with large probability. Here are the results in this direction.
Theorem 4.8. Let $C$ be a narrowly closed convex subset of $M_1(E)$ such that $H(C \mid \alpha) < +\infty$,
and $\alpha^*$ be the I-projection of $\alpha$ on $C$. Assume that there exist a sequence $(K_n)_n$ of compact
subsets of $E$ and a sequence $(\eta_n)_n$ of non negative real numbers such that

$$n \eta_n^2 + 8 (\log \eta_n) N_E(K_n, d, \eta_n/8) \to +\infty$$

as $n \to \infty$. Let $\varepsilon_n = \eta_n + 2 \alpha^*(K_n^c)$. If one of the following additional assumptions is satisfied

• $\lim_{n \to \infty} (\alpha^*(K_n))^n = 1$ ,

• $\log \frac{d\alpha^*}{d\alpha}$ is continuous and bounded, and $\lim_{n \to \infty} \alpha^*(K_n) = 1$ ,

then $\alpha^n_{C_n,k} \to \alpha^{*\otimes k}$ in total variation distance.

Here again the conditions are not sharp, but they hold for both the Prohorov and the
Fortet-Mourier metrics.

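Proof. The proof relies on the following Lemma.
Lemma 4.9. For all compact subset $K$ and all $\eta > 0$,

$$\alpha^{*\otimes n}\left( d(L_n, C) \le \eta + 2 \alpha^*(K^c) \right) \ge (\alpha^*(K))^n \left( 1 - (16 e/\eta)^{N_E(K, d, \eta/8)} \, e^{-n \eta^2/8} \right) .$$

Proof of the Lemma. Introduce $\alpha_K^* = \frac{1_K}{\alpha^*(K)} \, \alpha^*$. Then

$$d(\alpha_K^*, \alpha^*) \le \| \alpha_K^* - \alpha^* \|_{TV} = \int \left| \frac{1_K}{\alpha^*(K)} - 1 \right| d\alpha^* \le 2 \alpha^*(K^c)$$

so that according to the triangle inequality $d(\nu, \alpha^*) \le d(\alpha_K^*, \nu) + 2 \alpha^*(K^c)$ for all $\nu$. Hence
$B(\alpha_K^*, \eta) \subset \{ \nu \, , \, d(\nu, C) \le \eta + 2 \alpha^*(K^c) \}$ and

$$\alpha^{*\otimes n}\left( d(L_n, C) \le \eta + 2 \alpha^*(K^c) \right) \ge \alpha^{*\otimes n}(L_n \in B(\alpha_K^*, \eta))$$

$$\ge \alpha^{*\otimes n}(L_n \in B(\alpha_K^*, \eta) \text{ and } x \in K^n)$$

$$\ge (\alpha^*(K))^n \, \alpha_K^{*\otimes n}(L_n \in B(\alpha_K^*, \eta)) .$$

As in the proof of Theorem 4.5 and using (4.6.1 or 2) we have

$$\alpha_K^{*\otimes n}(L_n \in B(\alpha_K^*, \eta)) \ge 1 - N_{M_1(K)}(d, \eta/4) \, e^{-n \eta^2/8} \ge 1 - (16 e/\eta)^{N_E(K, d, \eta/8)} \, e^{-n \eta^2/8} .$$

□

The first part of the Theorem is then immediate.

The second part is a little bit more tricky. Let $h = \log \frac{d\alpha^*}{d\alpha}$. For all $\varepsilon > 0$

$$\alpha^{\otimes n}(L_n \in B(\alpha^*, \varepsilon)) \ge e^{-n H(\alpha^* \mid \alpha)} \, e^{-n \Delta(\varepsilon)} \, \alpha^{*\otimes n}(L_n \in B(\alpha^*, \varepsilon))$$

where $\Delta(\varepsilon) = \sup_{\nu \in B(\alpha^*, \varepsilon)} \langle \nu - \alpha^*, h \rangle$. Since $h$ is continuous and bounded, it is immediate
that $\Delta(\varepsilon)$ goes to 0 as $\varepsilon$ goes to 0. Hence if $\varepsilon_n$ goes to 0

$$\liminf_{n \to \infty} \frac{1}{n} \log \left( \alpha^{\otimes n}(L_n \in C_n) \, e^{n H(\alpha^* \mid \alpha)} \right) \ge \liminf_{n \to \infty} \frac{1}{n} \log \left( \alpha^{*\otimes n}(L_n \in B(\alpha^*, \varepsilon_n)) \right) .$$

Thus if we choose $\varepsilon_n$ as in the statement of the Theorem, the right hand side of the previous
inequality is greater than

$$\liminf_{n \to \infty} \left( \log \alpha^*(K_n) + \frac{1}{n} \log \left( 1 - (16 e/\eta_n)^{N_E(K_n, d, \eta_n/8)} \, e^{-n \eta_n^2/8} \right) \right) = 0$$

and we may apply Theorem 2.7. □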

In the next section we shall study some typical examples. 



5. Examples. 

In Section 3 we already discussed the examples of F-moments. In this section we shall first
look at the finite dimensional situation, then study examples in relation with Stochastic
Mechanics.

5.1. Finite dimensional convex constraints. 

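Proposition 5.1. If $E = \mathbb{R}^q$, let $C$ be a narrowly closed convex subset of $M_1(E)$ such that
$H(C \mid \alpha) < +\infty$, and $\alpha^*$ be the I-projection of $\alpha$ on $C$. Then $\alpha^n_{C_n,k} \to \alpha^{*\otimes k}$ in total
variation distance with $\varepsilon_n = 2/n^b$ and $0 < b < \frac{1 - q/a}{2 + q}$ provided there exists $a > q$ such that
$\int \| x \|^a \, d\alpha^* < +\infty$ (that holds in particular if $\int e^{\lambda \| x \|^a} \, d\alpha < +\infty$ for some $\lambda > 0$).
In addition if either $\int e^{\lambda \| x \|} \, d\alpha^* < +\infty$ for some $\lambda > 0$, or $\log \frac{d\alpha^*}{d\alpha}$ is bounded and continuous,
we may choose $b < \frac{1}{2 + q}$.

Of course in general hypotheses on $\alpha^*$ are difficult to check directly. That is why the $\alpha$
exponential integrability is a pleasant sufficient condition.

Proof. Let $M = \int \| x \|^a \, d\alpha^*$. For $K_n = B(0, n^u)$ we have $\alpha^*(K_n^c) \le M / n^{au}$, hence
$(\alpha^*(K_n))^n \to 1$ provided $au > 1$. In addition

$$N_E(K_n, d, \eta/8) \le M' \, n^{uq} / \eta^q$$

so that if $\eta_n = 1/n^b$ with $b > 0$

$$n \eta_n^2 + 8 (\log \eta_n) N_E(K_n, d, \eta_n/8) \ge n^{1 - 2b} \left( 1 - 8 b M' (\log n) \, n^{uq + b(2+q) - 1} \right)$$

goes to $+\infty$ as soon as $b < \frac{1 - uq}{2 + q}$, which is possible for any $b < \frac{1 - q/a}{2 + q}$ since the only constraint on $u$ is $au > 1$. We may thus apply Theorem
4.8 with $\varepsilon_n = (1/n^b) + 2 (M / n^{au}) \le 2 (1/n^b)$ for $n$ large enough.

If the $\alpha^*$ exponential integrability condition is satisfied we may choose $a$ as large as we want.
If $\log \frac{d\alpha^*}{d\alpha}$ is bounded, $\alpha^*(K_n)$ growing to 1, the condition $au > 1$ is not necessary. □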






5.2. Schrödinger bridges. In this subsection and the next one $E = C^0([0,1], M)$ where
$M$ is either $\mathbb{R}^q$ or a smooth connected and compact Riemannian manifold of dimension $q$.
$E$ is equipped with the sup-norm and for simplicity with the Wiener measure $W$ (i.e. the
infinitesimal generator is the Laplace-Beltrami operator), with initial measure $\mu_0$.
An old question by Schrödinger can be described as follows (see UHl for the original sentence
in French). Let $(X_j)_{j=1,\ldots,n}$ be an $n$-sample of $W$. Assume that the empirical measure at time
1 (i.e. $L_n(1) = \frac{1}{n} \sum_{j=1}^n \delta_{X_j(1)}$) is far from the expected law $\mu_1$ of the Brownian motion at
time 1. What is the most likely way to observe such a deviation? Clearly the answer (when
the number of Brownian particles grows to infinity) is furnished by the Gibbs conditioning
principle: the most likely way is to imagine that any block of $k$ particles is made of (almost)
independent particles with common law $W^*$ which minimizes $H(V \mid W)$ among all probability
measures $V$ on $E$ such that $V \circ X^{-1}(0) = \mu_0$ and $V \circ X^{-1}(1)$ belongs to the observed set of
measures. If the observed set is reduced to a single measure (thin) a double limit formulation 
of this principle is contained in the first chapter of . 
To be precise introduce for e > 

(5.2) C^(z.o, vi) = {V G Mi{E) s.t. d{Vo, i^o) < e , d{Vi,ui) < e} , 

where Vt denotes the law V o X~^(t). When e = we will not write the superscript 0. We 
are in the situation studied in the previous section since C(fOi i^i) is a narrowly closed convex 
subset of Mi{E). We shall write W* the /- projection of W on C (without specifying unless 
necessary the initial and final measures) when it exists. 

Before applying the results of Section 4 we shall recall some known results about $C$ and $W^*$.
Denote by $V_{u,v}$ (resp. $W_{u,v}$) the conditional law of $V$ (resp. $W$) knowing that $X(0) = u$ and $X(1) = v$, i.e. the law of the $V$ (resp. $W$) bridge from $u$ to $v$. Also denote by $\nu_{0,1}$ (resp. $\mu_{0,1}$) the $V$ (resp. $W$) joint law of $(X(0), X(1))$. The decomposition of entropy formula

$$H(V \mid W) = H(\nu_{0,1} \mid \mu_{0,1}) + \int H(V_{u,v} \mid W_{u,v})\, d\nu_{0,1}(u,v) ,$$

immediately shows that, if it exists,

$$W^* = \int W_{u,v}\, d\mu^*_{0,1}(u,v) ,$$

where $\mu^*_{0,1}$ is the $I$-projection of $\mu_{0,1}$ on

$$\Pi(\nu_0, \nu_1) = \{\beta \in M_1(M \times M) \ \text{s.t.}\ \beta_0 = \nu_0 , \ \beta_1 = \nu_1\} ,$$

if it exists. In other words the problem reduces to a finite dimensional one, i.e. on $M \times M$.
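The decomposition of entropy can be checked numerically on a finite toy model. The sketch below is only an illustration: the three-point "endpoint" space, the two-valued "bridge" label and both probability tables are hypothetical choices, not objects from the paper. It verifies that the relative entropy of a joint law splits into the entropy of the endpoint marginals plus the averaged entropy of the conditional laws.

```python
import math

def relative_entropy(p, q):
    """H(p | q) = sum p log(p/q) for two laws given as dicts on the same finite set."""
    return sum(pv * math.log(pv / q[k]) for k, pv in p.items() if pv > 0)

def marginal(joint):
    """Law of the first coordinate (the 'endpoint') of a joint law."""
    out = {}
    for (e, _), w in joint.items():
        out[e] = out.get(e, 0.0) + w
    return out

def conditional(joint, e):
    """Conditional law of the second coordinate (the 'bridge') given the endpoint e."""
    mass = sum(w for (e2, _), w in joint.items() if e2 == e)
    return {b: w / mass for (e2, b), w in joint.items() if e2 == e}

# Hypothetical toy laws on {endpoint} x {bridge label}: V plays the role of the
# constrained law, W the role of the reference measure.
V = {("a", 0): 0.10, ("a", 1): 0.20, ("b", 0): 0.25, ("b", 1): 0.05,
     ("c", 0): 0.30, ("c", 1): 0.10}
W = {("a", 0): 0.20, ("a", 1): 0.10, ("b", 0): 0.10, ("b", 1): 0.20,
     ("c", 0): 0.15, ("c", 1): 0.25}

lhs = relative_entropy(V, W)
v_end, w_end = marginal(V), marginal(W)
rhs = relative_entropy(v_end, w_end) + sum(
    v_end[e] * relative_entropy(conditional(V, e), conditional(W, e))
    for e in v_end
)
```

Since the second term is nonnegative and vanishes only when the conditional laws agree, minimizing the left-hand side under endpoint constraints amounts to keeping the reference bridges and minimizing over the endpoint law alone, which is the reduction used above.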
The following theorem collects some results we need.

Theorem 5.3. Assume that $H(\nu_0 \mid \mu_0)$ and $H(\nu_1 \mid \mu_1)$ are both finite and that $p = \frac{d\mu_{0,1}}{d(\mu_0 \otimes \mu_1)}$ satisfies $\log p \in L^1(\nu_0 \otimes \nu_1)$. Then $H(\Pi(\nu_0, \nu_1) \mid \mu_{0,1})$ is finite.
In addition $\frac{d\mu^*_{0,1}}{d\mu_{0,1}}(u, v) = f(u) g(v)$ for any pair of functions $(f, g)$ satisfying

(5.4) $$\begin{cases} \dfrac{d\nu_0}{d\mu_0}(u) = f(u) \displaystyle\int p(u,v)\, g(v)\, d\mu_1(v) \\[2mm] \dfrac{d\nu_1}{d\mu_1}(v) = g(v) \displaystyle\int p(u,v)\, f(u)\, d\mu_0(u) . \end{cases}$$

The proof is contained in [8] Proposition 6.3 and [18] p. 161-164.

Finally, under the assumptions of Theorem 5.3,

$$\frac{dW^*}{dW} = f(X(0))\, g(X(1)).$$
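On a finite state space the Schrödinger system for $(f, g)$ can be solved by alternating between its two equations (an iterative proportional fitting scheme). The sketch below is a minimal illustration under assumed data: the three-by-two joint law `joint`, the target marginals `nu0`, `nu1` and the function name `solve_schrodinger_system` are all hypothetical, and the alternating iteration is a standard numerical device rather than a method taken from the paper.

```python
def solve_schrodinger_system(mu0, mu1, p, nu0, nu1, sweeps=500):
    """Alternately enforce the two equations of the discrete Schrodinger system.

    p[u][v] is the density of the joint law with respect to mu0 x mu1.
    """
    n0, n1 = len(mu0), len(mu1)
    f = [1.0] * n0
    g = [1.0] * n1
    for _ in range(sweeps):
        # first equation: (nu0/mu0)(u) = f(u) * sum_v p(u,v) g(v) mu1(v)
        f = [(nu0[u] / mu0[u]) / sum(p[u][v] * g[v] * mu1[v] for v in range(n1))
             for u in range(n0)]
        # second equation: (nu1/mu1)(v) = g(v) * sum_u p(u,v) f(u) mu0(u)
        g = [(nu1[v] / mu1[v]) / sum(p[u][v] * f[u] * mu0[u] for u in range(n0))
             for v in range(n1)]
    return f, g

# Hypothetical reference joint law mu_{0,1} with marginals mu0 and mu1.
mu0 = [0.5, 0.3, 0.2]
mu1 = [0.4, 0.6]
joint = [[0.25, 0.25], [0.05, 0.25], [0.10, 0.10]]
p = [[joint[u][v] / (mu0[u] * mu1[v]) for v in range(2)] for u in range(3)]

# Hypothetical prescribed marginals nu0 and nu1.
nu0 = [0.2, 0.3, 0.5]
nu1 = [0.7, 0.3]

f, g = solve_schrodinger_system(mu0, mu1, p, nu0, nu1)

# Candidate I-projection: density f(u) g(v) with respect to the reference joint law.
star = [[f[u] * g[v] * joint[u][v] for v in range(2)] for u in range(3)]
row_marginal = [sum(row) for row in star]
col_marginal = [sum(star[u][v] for u in range(3)) for v in range(2)]
```

The product form $f(u) g(v)$ of the density is what makes the alternating scheme natural: each half-step rescales one factor so that one marginal constraint holds exactly.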

We can now state 

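Theorem 5.5. Under the assumptions of Theorem 5.3,

$$\mathcal{L}(X_1, \ldots, X_k \,/\, L_n \in C^{\varepsilon_n}(\nu_0, \nu_1)) \longrightarrow W^{*\otimes k}$$

in total variation distance for every sequence $\varepsilon_n$ going to 0 such that the following holds: for all sequences $(Y_j)_j$ (resp. $(Z_j)_j$) of i.i.d. random variables with law $\nu_0$ (resp. $\nu_1$),

$$\lim_{n \to +\infty} P(d(L_n^Y, \nu_0) < \varepsilon_n) = 1 \quad \text{and} \quad \lim_{n \to +\infty} P(d(L_n^Z, \nu_1) < \varepsilon_n) = 1 .$$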

In particular the above convergence holds for instance in the following two cases 

• M is compact and ne'^ + 8{logSn)NM{d,en/8) +oo, 

• M = R'^ , there exists a > q such that for z = 0, 1 , / || x ||" dui < +oo , £n = 

1-3- 

and b < ttt^. 

For the proof just apply Corollary 4.2, and for the examples Proposition 4.6 and Proposition

□

5.3. Nelson processes. A natural generalization of the framework of Subsection 5.2 is to impose the full flow of marginal laws instead of only the initial and final ones. Building diffusion processes with a given flow of marginal laws is the first step in Nelson's approach to the Schrödinger equation. The problem was first tackled by Carlen [1]. The relationship with minimization of entropy was first observed by H. Föllmer ([18]) and explored in detail in a series of papers by C. Léonard and the first named author ([HIIZIIHI). This approach and the results below can be viewed as a "statistical mechanics" approach to quantum mechanics. We shall not discuss further the meaning of the previous sentence here. We prefer to insist on the enormous difference between a pair of marginal laws and the flow of all marginal laws.
Hence here

$$C(\nu_t) = \{V \in M_1(E) \ \text{s.t.}\ \forall t \in [0,1] , \ V_t = \nu_t\} ,$$

and for $\varepsilon > 0$

$$C^{\varepsilon}(\nu_t) = \{V \in M_1(E) \ \text{s.t.}\ d(V, C(\nu_t)) \le \varepsilon\} .$$
For simplicity we shall only consider the case $M = \mathbb{R}^q$ (though a similar discussion is possible for a general connected and compact Riemannian manifold). In order not to lose sight of our main goal we first state the convergence result we have in mind, and will discuss the hypotheses later on.

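Theorem 5.6. Assume that $C(\nu_t)$ is non empty and that $W$ has an $I$-projection $W^*$ on $C(\nu_t)$, such that $\log \frac{dW^*}{dW}$ is bounded and continuous. Assume in addition that the initial law $\mu_0$ has a polynomial concentration rate, i.e. $\mu_0(B^c(0, R)) \le C / R^m$ for some $m > 0$ and all $R > 0$. Then if $\varepsilon_n = 1/(\log n)^r$ for some $r < 1/2q$,

$$\mathcal{L}(X_1, \ldots, X_k \,/\, L_n \in C^{\varepsilon_n}(\nu_t)) \longrightarrow W^{*\otimes k}$$

in total variation distance.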
Proof. According to Theorem 4.8 it is enough to find a sequence $K_n$ of compact subsets of $E$ and a sequence $\eta_n$ of positive numbers going to 0 such that

$$\lim_{n \to \infty} W^*(K_n) = 1 \quad \text{and} \quad \lim_{n \to \infty} \left( n\eta_n^2 + 8(\log \eta_n)\, N_E(K_n, \|\cdot\|_\infty, \eta_n/8) \right) = +\infty .$$

Since is bounded by some , we may replace the first condition by hm„^oo VV(ifn) = 1 
and choose e„ > rjn + 2e^yV{K^). The most natural way to choose such compact sets is to 
use Kolmogorov's regularity criterion. Since the support of $W$ is included in the set of Hölder paths of order $\beta < 1/2$, introduce

$$K(R, M, \beta) = \left\{ w \in E \ \text{s.t.}\ |w(0)| \le R \ \text{and} \ \sup_{s \ne t \in [0,1]} \frac{|w(s) - w(t)|}{|s - t|^{\beta}} \le M \right\} ,$$

for $R$, $M$ positive and $\beta < 1/2$. Kolmogorov's criterion tells us that

$$W(K^c(R, M, \beta)) \le \mu_0(B^c(0, R)) + C(p, \beta)\, M^{-p}$$

for all $p > 1$. In addition, thanks to Theorem 2.7.1 p. 155 in

NE{K{R,M,f3),\\\\o,,v/8) < c,{P,q){8R/7iye''^MiM/v)''^\ 
Choosing Kn = k{Rn, Mn,r]n) with 

Rn = {alognf/'i"' M„ = (ftlogn)^/'? ??„ = (clog n)-^/"? 
we see that nr/^ + 8(logr/„) N^iKn, \\ ||oo)^n/8) is less than 

(2/3 
Ai + A2 log(clogn) (logn)^+T ^c2(/3,g)fec-i 

for some Ai and A2 independent of n. Choosing b in such a way that C2(/3, q)bc — 1 < we 
obtain a leading term going to +00 as n goes to 00. 
Putting all this together, we get 

7]n + 2e^W{K^) < (clogn)-^/« + 2Ce^(alogn)-^/'? + 2e^C(p,/?)(61ogn)-^P/'? 

which is less than (logn)"^'/'' for all /3' < 1/2 and n large enough. □ 

Remark 5.7. The assumption log bounded and continuous is essential. Indeed without 
it Theorem 1121 requires (>V*(K„))"' goes to 1, i.e. W*{K^) = o{l/n). Assuming that be- 
longs to Lr(W), Kolmogorov criterion yields M„ of order n". It is then easy to see that this is 
no more compatible with any choice of rjn such that lim^^oo [nr]'^ + 8(log ry„) NE{Kn, \\ \\oo, Vn/^)) 
+00. 

To conclude this subsection let us say a few words about our assumptions. 
First of all $C(\nu_t)$ is non empty as soon as $\nu_t$ satisfies a Fokker-Planck equation with a drift $B(t, X(t))$ of finite energy (i.e. $\int_0^1 \int B^2(t,x)\, d\nu_t\, dt < +\infty$), see jHEHI]. In addition Girsanov theory is still available (see [8], [7] for the details) so that

^ = 1^ exp Bit, w{t))dw{t) - 1/2 £ \B{t, w;(t))pdt 

where $T = \inf\{s \le 1 \ \text{s.t.}\ \int_0^s |B(t, w(t))|^2\, dt = +\infty\}$. In general this density (even when $T = 1$) is not continuous.

Nevertheless some interesting cases enter the framework of Theorem 5.6.
Let $U$ be a $C^2_b$ potential. Then the law $V_0$ of the unique strong solution of

$$dX_t = dW_t - \nabla U(X_t)\, dt , \quad \mathcal{L}(X_0) = \nu_0$$

satisfies

$$\frac{dV_0}{dW} = \frac{d\nu_0}{d\mu_0}(w(0)) \, \exp\left( U(w(0)) - U(w(1)) - \frac{1}{2} \int_0^1 \left( |\nabla U|^2 - \Delta U \right)(t, w(t))\, dt \right) .$$

Hence $\log \frac{dV_0}{dW}$ is bounded and continuous as soon as $\log \frac{d\nu_0}{d\mu_0}$ is. In addition $V_0$ is the $I$-projection of $W$ on $C(\nu_t)$ where $\nu_t = \mathcal{L}(X_t)$ (see 0). The conclusion of Theorem 5.6 is thus available for $V_0$. If we replace $\mathbb{R}^q$ by a compact manifold we may include the stationary (actually reversible) case, i.e. $\nu_0 = e^{-2U} dx / Z_U$.

6. A super-thin case: volatility calibration.

In Subsections 5.2 and 5.3 we have studied the laws of some diffusion processes from the point of view of $I$-projections, hence we only allowed a change of drift. We shall now study the opposite situation: the drift being fixed, how should one choose the diffusion coefficient? We thus immediately lose any kind of absolute continuity, which introduces a new difficulty, namely super-thin subsets. Let us describe the problem precisely.

Consider a family (indexed by continuous time-space functions $\sigma$) of S.D.E.

(6.1) $\forall t \in [0,1], \quad dX(t) = \sigma(t, X(t))\, dw(t) + b_0(t, X(t))\, dt \ ; \quad X(0) = 0,$

where $w$ is a standard Brownian motion. We assume that $b_0$ is continuous and bounded and

$$0 < \sigma_{\min} \le \sigma \le \sigma_{\max} < +\infty$$

for some real numbers $\sigma_{\min}$ and $\sigma_{\max}$. Under this assumption, it is well known that (6.1) admits weak solutions and that there is uniqueness in law. We will denote in the sequel by $Q_{\sigma, b_0}$ the probability measure on $\Omega = C([0,1], \mathbb{R})$ thus defined by (6.1).

In [2] the authors addressed the problem of calibrating $\sigma$ (the volatility in mathematical finance) when $b_0$ is known (a consequence of the "absence of arbitrage") and $X$ satisfies a set of generalized moment constraints

(6.2) $E_P[f_j(t_j, X(t_j))] = c_j , \quad j \in A, \ A \text{ finite}.$

Their strategy is based on the following Bayesian principle: take a prior $\sigma_0$; the corresponding prior law of $X$ is $Q_{\sigma_0, b_0}$. Then the "most probable" $P$ satisfying (6.2) will be the one which minimizes the relative entropy $H(P \mid Q_{\sigma_0, b_0})$. Of course this principle is meaningless here. Indeed, the finiteness of $H(P \mid Q_{\sigma_0, b_0})$ implies that $P$ has the same diffusion coefficient as $Q_{\sigma_0, b_0}$, hence there is no such $P$ satisfying (6.2) unless $Q_{\sigma_0, b_0}$ does. To bypass this difficulty, the authors propose to approximate $Q_{\sigma_0, b_0}$ by some well chosen $Q^n_{\sigma_0, b_0}$ (actually various time discretizations), in such a way that $\frac{1}{n} H(P^n \mid Q^n_{\sigma_0, b_0})$ goes to some limit $K(P \mid Q_{\sigma_0, b_0})$, and then use $K$ as the cost function to be minimized.

We shall interpret this strategy in the following way. 

For simplicity assume that the set of constraints is reduced to a single one, i.e. introduce the set

$$C_F = \{P , \ E_P[F(X(1))] = 1\}$$

where $P$ describes the set of probability measures on $\Omega = C([0,1], \mathbb{R})$. We will choose as before some $\varepsilon$-enlargement of $C_F$, i.e. define

$$C_F^{\varepsilon} = \left\{ P , \ \left| \int F(X(1))\, dP - 1 \right| < \varepsilon \right\} .$$

Again for simplicity, we shall assume that $b(t, x) = b_0$ for some $b_0 > 0$ (extensions to more general cases can be easily done). We also define

$$\Sigma_0 = \{\sigma : [0,1] \times \mathbb{R} \to\, ]\sigma_{\min}, \sigma_{\max}[ \, , \ \text{continuous}\}$$

and for $\varepsilon < b_0$,

$$B_{\varepsilon} = \{b : [0,1] \times \mathbb{R} \to\, ]b_0 - \varepsilon, b_0 + \varepsilon[ \, , \ \text{continuous}\} .$$

Let us specify that the space of space-time continuous functions $C([0,1] \times \mathbb{R}, \mathbb{R})$ will always be furnished with the topology of uniform convergence on every compact subset of $[0,1] \times \mathbb{R}$.
Now we introduce a standard approximation of $Q_{\sigma, b_0}$, namely the trinomial tree.
Choose some $a > \sigma_{\max}$ and $0 < s < b_0$. For $(y, z) \in$ we define

For $n$ large enough ($\ge n_0$), it is easily seen that for all $(y, z) \in [\sigma_{\min}, \sigma_{\max}] \times [b_0 - s, b_0 + s]$ the vector $(m^n, d^n, r^n)$ has all its entries strictly positive (their sum being 1), so that we may define the following transition kernel on $\mathbb{R}$ for all $(\sigma, b) \in \Sigma_0 \times B_s$, $n \ge n_0$ and $(t, x) \in [0,1] \times \mathbb{R}$,

n^, b(t, X, . ) = (a, b) it, x) .6^+^+ {a, b){t,x).6^ + (a, b) {t, x) .(5^_ « . 

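We thus define the probability measure $Q^n_{\sigma, b}$ by

(6.3)
(1) $Q^n_{\sigma, b}(X_0 = 0) = 1$,
(2) $Q^n_{\sigma, b}\left( X_t = X_{k/n} + (nt - k)\left[ X_{(k+1)/n} - X_{k/n} \right] , \ \frac{k}{n} \le t \le \frac{k+1}{n} \right) = 1$,
(3) $Q^n_{\sigma, b}\left( X_{(k+1)/n} \in \cdot \mid X_{k/n}, \ldots, X_0 \right) = \Pi^n_{\sigma, b}\left( \frac{k}{n}, X_{k/n}, \cdot \right)$.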

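In the sequel, we will denote by $E^n_{\sigma, b}$ the expectation with respect to the trinomial tree $Q^n_{\sigma, b}$. The support of $Q^n_{\sigma, b}$ is $\Omega_n$ defined by

$$\Omega_n = \left\{ \omega \in \Omega \ : \ \omega(0) = 0 \ ; \ \ \omega\left(\tfrac{i+1}{n}\right) - \omega\left(\tfrac{i}{n}\right) \in \left\{ -\tfrac{a}{\sqrt{n}}, 0, \tfrac{a}{\sqrt{n}} \right\} \text{ for } i = 0, \ldots, n-1 \ ; \ \ \omega \text{ affine on } \left[\tfrac{i}{n}, \tfrac{i+1}{n}\right] \text{ for } i = 0, \ldots, n-1 \right\} .$$

The set $\Omega_n$ is finite with cardinality $3^n$.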
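The lattice structure of the trinomial tree can be made concrete with a short enumeration. The sketch below lists the node values of all paths of $\Omega_n$ for a small $n$ and computes one-step probabilities. The formulas in `trinomial_probs` are an assumption: they are a common moment-matching choice (step mean $z/n$, step second moment $y^2/n$, with steps of size $a/\sqrt{n}$), and the paper's own definition of $(m^n, d^n, r^n)$ may differ. The function names and numerical values are hypothetical.

```python
import itertools
import math

def trinomial_probs(y, z, a, n):
    """Hypothetical moment-matched (up, stay, down) probabilities for one step.

    Assumption: steps of size a/sqrt(n), step mean z/n, step second moment y**2/n.
    """
    up = y * y / (2 * a * a) + z / (2 * a * math.sqrt(n))
    down = y * y / (2 * a * a) - z / (2 * a * math.sqrt(n))
    stay = 1.0 - y * y / (a * a)
    return up, stay, down

def lattice_paths(a, n):
    """Node values (omega(0), omega(1/n), ..., omega(1)) of every path in Omega_n."""
    h = a / math.sqrt(n)
    paths = []
    for moves in itertools.product((-1, 0, 1), repeat=n):
        pos, path = 0, [0.0]
        for m in moves:
            pos += m
            path.append(pos * h)
        paths.append(tuple(path))
    return paths

a, n = 2.0, 6
paths = lattice_paths(a, n)

# One-step law for volatility y = 1.0 and drift z = 0.5 (illustrative values, y < a).
up, stay, down = trinomial_probs(y=1.0, z=0.5, a=a, n=n)
h = a / math.sqrt(n)
step_mean = h * (up - down)
step_second_moment = h * h * (up + down)
```

With this choice the requirement $a > \sigma_{\max}$ keeps the "stay" probability positive, and taking $n$ large keeps the "down" probability positive, in line with the positivity statement above.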



Finally, denoting by $L_m = \frac{1}{m} \sum_{i=1}^m \delta_{\omega_i}$ the empirical measure on $\Omega_n$, we shall study $R^{n,m}_{\varepsilon}$ defined by

$$R^{n,m}_{\varepsilon}(B) = \left( Q^n_{\sigma_0, b_0} \right)^{\otimes m}\left( \omega_1 \in B \,/\, L_m \in T^n_{\varepsilon} \cap C^{\varepsilon}_F \right) ,$$
where $T^n_{\varepsilon}$ will be defined later. Let us just say for the moment that $T^n_{\varepsilon}$ is an open set of $M_1(\Omega_n)$ which contains all the trinomial trees $Q^n_{\sigma, b}$ with $\sigma$ in a totally bounded subset $\Sigma_1$ of $\Sigma_0$ and $b \in B_{\varepsilon}$. Roughly speaking, for each level of approximation ($n$) we consider an $m$-sample of the trinomial tree and look at the conditional law of the first coordinate, knowing that the empirical measure is not too far from being a trinomial tree satisfying the moment constraint.

Our aim is to show that one can find sequences $\varepsilon_n$ going to 0 and $m_n$ going to infinity, such that $R^{n,m_n}_{\varepsilon_n}$ goes towards some $Q_{\sigma^*, b_0}$, the one proposed in [2], which we will now describe.
First, for fixed $n$ and $\varepsilon$, since all measures are defined on a finite set, it is not difficult to see that the set $\mathcal{M}^n_{\varepsilon}$ of minimizers of $H(\,\cdot \mid Q^n_{\sigma_0, b_0})$ on the closure of $T^n_{\varepsilon} \cap C^{\varepsilon}_F$ is nonempty. It can then be shown that the elements of $\mathcal{M}^n_{\varepsilon}$ are still trinomial trees. Now an easy computation shows that $\sigma \mapsto \frac{1}{n} H(Q^n_{\sigma, b} \mid Q^n_{\sigma_0, b_0})$ converges (in a sense close to the $\Gamma$-convergence sense) to a functional $I(\sigma \mid \sigma_0)$, whose expression is recalled in the proof of Proposition 6.9.

One thus expects that the limit $Q_{\sigma^*, b_0}$ is the one obtained by minimizing $I$ on $\Sigma_0$ under the moment constraint.

The remainder of this section is devoted to rigorous statements and proofs. Note that the result gives a rigorous statistical flavor to the method proposed by Avellaneda et al.
6.1. Presentation of the results. We recall that the space $C([0,1] \times \mathbb{R}, \mathbb{R})$ is equipped with the topology of uniform convergence on every compact subset of $[0,1] \times \mathbb{R}$. Before presenting our results, let us state the basic convergence property of trinomial trees:

Proposition 6.4. If $s > \varepsilon_n > 0$ goes to zero and $\sigma_n \in \Sigma_0$ goes to $\sigma \in \Sigma_0$ then, for all $b_n \in B_{\varepsilon_n}$, the sequence $Q^n_{\sigma_n, b_n}$ goes to $Q_{\sigma, b_0}$.

From now on, we will make the following assumptions:

• The minimum value of the function $I(\,\cdot \mid \sigma_0)$ on the set $\{\sigma \in \Sigma_0 : \int F(X_1)\, dQ_{\sigma, b_0} = 1\}$ is attained at a unique point $\sigma^*$.

• The minimizer $\sigma^*$ belongs to $\Sigma_0$.

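Now let us introduce some notation. For all $\sigma \in \Sigma_0$, let $\Delta_{n,\sigma}$ be the continuity modulus of $\sigma$ on the compact set $[0,1] \times [-a\sqrt{n}, a\sqrt{n}]$, i.e.

$$\Delta_{n,\sigma}(\varepsilon) = \sup\left\{ |\sigma(t,x) - \sigma(s,y)| \ : \ s, t \in [0,1] , \ x, y \in [-a\sqrt{n}, a\sqrt{n}] , \ |t - s| + |x - y| \le \varepsilon \right\} .$$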



n 




with 




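Let $\Sigma_1$ be defined by

$$\Sigma_1 = \{\sigma \in \Sigma_0 : \forall n \in \mathbb{N}^*, \ \Delta_{n,\sigma} \le 2 \Delta_{n,\sigma^*}\} .$$

According to Ascoli Theorem, $\Sigma_1$ is easily seen to be totally bounded.
Now let us consider the set $T^n_{\varepsilon}$ of all probability measures $Q$ on $\Omega_n$ satisfying

(6.5)
(1) $Q(X_0 = 0) = 1$,
(2) $Q\left( X_t = X_{k/n} + (nt - k)\left[ X_{(k+1)/n} - X_{k/n} \right] , \ \frac{k}{n} \le t \le \frac{k+1}{n} \right) = 1$,
(3) $\exists (\sigma, b) \in \Sigma_1 \times B_{\varepsilon}$ such that $Q\left( X_{(p+1)/n} \in \cdot \mid X_{p/n} \right) = \Pi^n_{\sigma, b}\left( \frac{p}{n}, X_{p/n}, \cdot \right)$.

In the sequel we will set $A^n_{\varepsilon} := T^n_{\varepsilon} \cap C^{\varepsilon}_F$. Defining (when possible), for all positive integers $m$,

$$R^{n,m}_{\varepsilon}(B) = \left( Q^n_{\sigma_0, b_0} \right)^{\otimes m}\left( \omega_1 \in B \,/\, L_m \in A^n_{\varepsilon} \right) ,$$

our main result is the following:

Theorem 6.6. If $\varepsilon_n = \min\left( \left| E^n_{\sigma^*, b_0}[F(X_1)] - 1 \right| + 1/n , \ s \right)$, then there exists a sequence $m_n$ of positive integers going to $+\infty$, such that $R^{n,m_n}_{\varepsilon_n}$ converges to $Q_{\sigma^*, b_0}$.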

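In order to prove this theorem, the first step is to study the convergence of $R^{n,m}_{\varepsilon_n}$ when $n$ is fixed and $m$ goes to $+\infty$. This is done in the two following propositions:

Proposition 6.7. Recall that $d_{FM}$ denotes the Fortet-Mourier distance, and for all $\varepsilon > 0$ let $\mathcal{M}^n_{\varepsilon}$ be the set of minimizers of $H(\,\cdot \mid Q^n_{\sigma_0, b_0})$ on $\overline{A^n_{\varepsilon}}$. Then,

$$d_{FM}\left( R^{n,m}_{\varepsilon_n}, \ \overline{co}\, \mathcal{M}^n_{\varepsilon_n} \right) \xrightarrow[m \to +\infty]{} 0 ,$$

where $\overline{co}\, \mathcal{M}^n_{\varepsilon_n}$ denotes the closed convex hull of $\mathcal{M}^n_{\varepsilon_n}$.

Proof. The set $A^n_{\varepsilon_n}$ is non empty (it contains $Q^n_{\sigma^*, b_0}$) and, according to the proposition below, it is open and satisfies $H(\overline{A^n_{\varepsilon_n}} \mid Q^n_{\sigma_0, b_0}) = H(A^n_{\varepsilon_n} \mid Q^n_{\sigma_0, b_0})$. The result follows immediately from the classical Gibbs conditioning principle. □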
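The classical Gibbs conditioning principle invoked in this proof can be observed exactly on a small finite alphabet. The sketch below is a toy illustration, not the setting of the paper: the alphabet $\{0, 1, 2\}$, the reference law `q`, and the conditioning event "empirical mean $> 1.2$" (an open half-space) are hypothetical choices. It computes the exact conditional law of the first coordinate of an $m$-sample and compares it, in total variation, with the entropy minimizer on the closed half-space, which here is an exponential tilt of `q`.

```python
import math

def tilted_law(q, target_mean):
    """Entropy minimizer under a mean constraint: p(x) proportional to q(x) exp(lam * x)."""
    def law(lam):
        w = [qx * math.exp(lam * x) for x, qx in enumerate(q)]
        s = sum(w)
        return [v / s for v in w]
    lo, hi = 0.0, 20.0
    for _ in range(200):  # bisection on the tilt parameter (the mean increases with lam)
        mid = 0.5 * (lo + hi)
        if sum(x * px for x, px in enumerate(law(mid))) < target_mean:
            lo = mid
        else:
            hi = mid
    return law(0.5 * (lo + hi))

def conditional_first_coordinate(q, m, num, den):
    """Exact law of X_1 given that the empirical mean of an m-sample exceeds num/den.

    By exchangeability this equals the conditional expectation of the empirical measure.
    """
    total, acc = 0.0, [0.0, 0.0, 0.0]
    for k2 in range(m + 1):
        for k1 in range(m + 1 - k2):
            k0 = m - k1 - k2
            if den * (k1 + 2 * k2) <= num * m:  # integer test of "mean <= num/den"
                continue
            log_w = (math.lgamma(m + 1) - math.lgamma(k0 + 1) - math.lgamma(k1 + 1)
                     - math.lgamma(k2 + 1) + k0 * math.log(q[0])
                     + k1 * math.log(q[1]) + k2 * math.log(q[2]))
            w = math.exp(log_w)  # multinomial probability of the count vector
            total += w
            for x, k in enumerate((k0, k1, k2)):
                acc[x] += w * k / m
    return [v / total for v in acc]

q = [0.5, 0.3, 0.2]          # hypothetical reference law, mean 0.7
p_star = tilted_law(q, 1.2)  # entropy minimizer on {mean >= 1.2}
tv = {m: 0.5 * sum(abs(c - p) for c, p in
                   zip(conditional_first_coordinate(q, m, 12, 10), p_star))
      for m in (50, 400)}
```

In this toy case the minimizer is unique, so the closed convex hull of the set of minimizers reduces to a single point and the total variation gap is expected to shrink as $m$ grows.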

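Proposition 6.8.

(1) The set $A^n_{\varepsilon}$ is an open subset of $M_1(\Omega_n)$, and satisfies $H(\overline{A^n_{\varepsilon}} \mid Q^n_{\sigma_0, b_0}) = H(A^n_{\varepsilon} \mid Q^n_{\sigma_0, b_0})$.

(2) Every element of $\mathcal{M}^n_{\varepsilon}$ is of the form $Q^n_{\sigma, b}$ for some $(\sigma, b) \in \overline{\Sigma}_1 \times \overline{B}_{\varepsilon}$.

According to Proposition 6.7 we know that for large $m$, $R^{n,m}_{\varepsilon_n}$ is close to $\overline{co}\, \mathcal{M}^n_{\varepsilon_n}$. The next step consists in proving that this set is close to $\{Q_{\sigma^*, b_0}\}$. This will follow from the particular type of convergence of the normalized entropy functions: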

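Proposition 6.9.

(1) If $0 < \varepsilon_n$ goes to 0, then for every sequence $b_n \in B_{\varepsilon_n}$, and for every $\sigma \in \Sigma_0$, the following holds:

$$\frac{1}{n} H(Q^n_{\sigma, b_n} \mid Q^n_{\sigma_0, b_0}) \xrightarrow[n \to +\infty]{} I(\sigma \mid \sigma_0).$$

(2) Furthermore, if $\sigma_n \in \Sigma_0$ converges to $\sigma \in \Sigma_0$, then

$$\liminf_{n \to +\infty} \frac{1}{n} H(Q^n_{\sigma_n, b_n} \mid Q^n_{\sigma_0, b_0}) \ge I(\sigma \mid \sigma_0).$$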
Remark 6.10. Recall that a sequence $(f_n)_n$ of real valued functions defined on some metric space $\Gamma$-converges to some function $f$, if

• for all $x$, $\lim_{n \to +\infty} f_n(x) = f(x)$,

• for every sequence $x_n$ converging to some $x$, $\liminf_{n \to +\infty} f_n(x_n) \ge f(x)$.

The preceding proposition can thus be restated by saying that for every $b_n \in B_{\varepsilon_n}$ with $\varepsilon_n$ going to 0, the sequence of functions $\sigma \mapsto \frac{1}{n} H(Q^n_{\sigma, b_n} \mid Q^n_{\sigma_0, b_0})$ $\Gamma$-converges to $\sigma \mapsto I(\sigma \mid \sigma_0)$.

It is well known that this kind of convergence is well adapted for deriving the convergence of minimizers. The next proposition illustrates this fact:

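Proposition 6.11. Suppose that for every $n$, $Q^n_{\sigma_n, b_n}$ is an element of $\mathcal{M}^n_{\varepsilon_n}$, then

(6.12) $Q^n_{\sigma_n, b_n} \xrightarrow[n \to +\infty]{} Q_{\sigma^*, b_0}$.

Proof. For all $n$, $Q^n_{\sigma^*, b_0}$ belongs to $A^n_{\varepsilon_n}$. Thus, using the minimization property of $Q^n_{\sigma_n, b_n}$, one has $\frac{1}{n} H(Q^n_{\sigma_n, b_n} \mid Q^n_{\sigma_0, b_0}) \le \frac{1}{n} H(Q^n_{\sigma^*, b_0} \mid Q^n_{\sigma_0, b_0})$. According to point (1) of Proposition 6.9 this implies that

(6.13) $\limsup_{n \to +\infty} \frac{1}{n} H(Q^n_{\sigma_n, b_n} \mid Q^n_{\sigma_0, b_0}) \le I(\sigma^* \mid \sigma_0)$.

According to point (2) of Proposition 6.8, $\sigma_n \in \overline{\Sigma}_1$. This set being compact, one can find some converging subsequence $\sigma_{n_p}$. Let $\bar{\sigma}$ be its limit. Point (2) of Proposition 6.9 yields

(6.14) $\liminf_{p \to +\infty} \frac{1}{n_p} H(Q^{n_p}_{\sigma_{n_p}, b_{n_p}} \mid Q^{n_p}_{\sigma_0, b_0}) \ge I(\bar{\sigma} \mid \sigma_0)$.

From (6.13) and (6.14), one deduces that

$$I(\bar{\sigma} \mid \sigma_0) \le I(\sigma^* \mid \sigma_0).$$

As $\sigma^*$ is the unique minimizer of $I(\,\cdot \mid \sigma_0)$ under the moment constraint, one has $\bar{\sigma} = \sigma^*$. The point $\sigma^*$ is thus the unique accumulation point of the compact sequence $\sigma_n$. It follows that $\sigma_n$ converges to $\sigma^*$. Now, (6.12) follows immediately from Proposition 6.4. □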
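The mechanism behind this proof (convergence in the sense of Remark 6.10, combined with compactness, forces minimizers to converge) can be seen on a one-dimensional toy problem. The sketch below is purely illustrative: the cost `f_n`, its limit $(x - 1)^2$ and the search interval are hypothetical and unrelated to the entropy functionals of the paper. Since `f_n` converges uniformly to a continuous limit, both conditions of Remark 6.10 hold, and the minimizers over a compact interval approach the unique minimizer of the limit.

```python
import math

def f_n(n, x):
    """Hypothetical perturbed cost; differs from the limit (x - 1)**2 by at most 1/n."""
    return (x - 1.0) ** 2 + math.cos(n * x) / n

def argmin_on_grid(func, lo, hi, steps):
    """Minimizer of func over a regular grid of the compact interval [lo, hi]."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(xs, key=func)

# Grid minimizers of f_n for increasing n; the limit cost is minimized at x = 1.
minimizers = {n: argmin_on_grid(lambda x, n=n: f_n(n, x), -2.0, 3.0, 5000)
              for n in (1, 10, 100, 1000)}
```

Uniqueness of the limiting minimizer matters here exactly as in the proof: it is what turns "every accumulation point is a minimizer" into convergence of the whole sequence.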

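We are now ready to prove Theorem 6.6.

Proof of Theorem 6.6. First, we have the following immediate inequality

$$d_{FM}\left( R^{n,m_n}_{\varepsilon_n}, Q_{\sigma^*, b_0} \right) \le d_{FM}\left( R^{n,m_n}_{\varepsilon_n}, \overline{co}\, \mathcal{M}^n_{\varepsilon_n} \right) + \sup_{Q \in \overline{co}\, \mathcal{M}^n_{\varepsilon_n}} d_{FM}\left( Q, Q_{\sigma^*, b_0} \right) .$$

Thus, according to Proposition 6.7 it suffices to prove that

$$\sup_{Q \in \overline{co}\, \mathcal{M}^n_{\varepsilon_n}} d_{FM}\left( Q, Q_{\sigma^*, b_0} \right) \xrightarrow[n \to +\infty]{} 0 .$$

The map $Q \mapsto d_{FM}(Q, Q_{\sigma^*, b_0})$ being convex and continuous, we get

$$\sup_{Q \in \overline{co}\, \mathcal{M}^n_{\varepsilon_n}} d_{FM}\left( Q, Q_{\sigma^*, b_0} \right) = \sup_{Q \in \mathcal{M}^n_{\varepsilon_n}} d_{FM}\left( Q, Q_{\sigma^*, b_0} \right) .$$

But $\mathcal{M}^n_{\varepsilon_n}$ is compact. Thus, there exists $Q^n_{\sigma_n, b_n} \in \mathcal{M}^n_{\varepsilon_n}$ such that

$$\sup_{Q \in \mathcal{M}^n_{\varepsilon_n}} d_{FM}\left( Q, Q_{\sigma^*, b_0} \right) = d_{FM}\left( Q^n_{\sigma_n, b_n}, Q_{\sigma^*, b_0} \right) .$$

Applying Proposition 6.11 we get

$$d_{FM}\left( Q^n_{\sigma_n, b_n}, Q_{\sigma^*, b_0} \right) \xrightarrow[n \to +\infty]{} 0 ,$$

which completes the proof. □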

Before giving the proofs of Propositions 6.8 and 6.9 let us make some comments on our result.
Remark 6.15.

• The reason why we work with $T^n_{\varepsilon}$ instead of the more natural set $\widetilde{T}^n_{\varepsilon} = \{Q^n_{\sigma, b} : \sigma \in \Sigma_1, b \in B_{\varepsilon}\}$ is that $\widetilde{T}^n_{\varepsilon}$ has empty interior. That set was thus a bad candidate for defining a conditioning event in the Gibbs principle. In fact, from the relative entropy point of view, working with $T^n_{\varepsilon}$ does not change anything: point (2) of Proposition 6.8 shows that the entropy minimizers on $\overline{A^n_{\varepsilon}}$ are trinomial trees.

• We introduced the set $\Sigma_1$ because some compactness is needed in Proposition 6.8. Note that if we replace $\Sigma_1$ by $\Sigma_0$ in the definition of $T^n_{\varepsilon}$, this set becomes convex (see [19]). In this framework, there is a unique entropy minimizer $Q^n_{\sigma^*_n, b^*_n}$. But we are not able to prove directly that the sequence $\sigma^*_n$ is compact. If this were true, Theorem 6.6 would hold with $\Sigma_0$ replacing $\Sigma_1$.

• The assumption that $I(\,\cdot \mid \sigma_0)$ admits a unique minimizer under the moment constraint is needed in the proof of Theorem 6.6. Namely, we used in the proof the fact that the function $Q \mapsto d_{FM}(Q, Q_{\sigma^*, b_0})$ is convex. If we were dealing with a set $M$ of minimizers containing more than one element, this function would be replaced by the function $Q \mapsto d_{FM}(Q, M)$, which is no longer convex.

6.2. Proofs.

Proof of (1) of Proposition 6.8. The set $C^{\varepsilon}_F$ being clearly open, it suffices to show that $T^n_{\varepsilon}$ is an open subset of $M_1(\Omega_n)$. First, it is easily seen that there is a constant $c > 0$ depending only on $\sigma_{\min}$, $\sigma_{\max}$, $b_0$, $s$ and $a$ such that



for all Q G and ah \ j\ < k < n. For all \ j\ < < n and Q G Mi(17„), let us define 




(6.16) Fkjm 




n 





and 
These maps are continuous on the open set

) E Ml : V|i| <k<n, o(xk 
and the fohowing holds 



ja 

In 



> c 



V|j| < k < n 
V|g| < p < n 



Xfe >c, 



FkjiQ) e]bo-e,bo + e[, 

Gk,j{Q) ^]'^min^ '^maxi^ 

'p,qiQ)\ < 2A„,^. 



Iv^ 



\/Gp 



k — p\ 

n n I 



+ 



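One easily concludes from this that $T^n_{\varepsilon}$ is an open subset of $M_1(\Omega_n)$.

Now let us show that $H(\overline{A^n_{\varepsilon}} \mid Q^n_{\sigma_0, b_0}) = H(A^n_{\varepsilon} \mid Q^n_{\sigma_0, b_0})$. As $Q^n_{\sigma_0, b_0}$ gives a positive mass to every trajectory of $\Omega_n$, the convex function $M_1(\Omega_n) \ni Q \mapsto H(Q \mid Q^n_{\sigma_0, b_0})$ is everywhere finite, thus continuous. As a consequence, $H(\overline{O} \mid Q^n_{\sigma_0, b_0}) = H(O \mid Q^n_{\sigma_0, b_0})$ holds true for every open set $O$ of $M_1(\Omega_n)$. This is in particular true for $A^n_{\varepsilon}$. □

In order to prove point (2) of Proposition 6.8 we need the following lemma.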



Lemma 6.18. For all a £ T,q, b G Bs, £ < s, let us define : 



and 

Then it holds : 
(6.19) 

(6.20) 



(t,x) = H{nU{t,x, .)\K.(t,x, .)). 



lib 



n-l 

n 



n-l 

H{Ql,\Ql^^,^) = Y,Kb 

i=0 



K,b;ao,bo [ 



Let Q be a probability measure satisfying 
{ (1) Q(Xo = 0) = l, 

(2) Q (Xt = Xk+ {nt - k) 



(6.21) 



X k+l — Xk 



Xi 



n — — n 



(3) Q Xp+i G 
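for some $\sigma \in \Sigma_0$ and $b \in B_s$. Then

(6.22) $\forall i = 0, \ldots, n-1, \quad \mathcal{L}_Q\left( X_{i/n} \right) = \mathcal{L}_{Q^n_{\sigma, b}}\left( X_{i/n} \right)$.

Furthermore,

(6.23) $H(Q \mid Q^n_{\sigma_0, b_0}) = H(Q \mid Q^n_{\sigma, b}) + H(Q^n_{\sigma, b} \mid Q^n_{\sigma_0, b_0})$.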
Proof. The proofs of (6.19), (6.20), (6.22) rely on very easy computations and are left to the reader. Let us prove (6.23). It is clear that,



(6.24) 

Next, we have 



H( 



H( 







d'^ao , bo J 





log 



n-1 

^lo, 

1=0 



n 



(ii) 



i=0 



€,b;ao,bo [ "'^^'^ 



n 



-,Xj_,dy 
n " 



n-l 



i=0 



'^a,b;ao,bo 



— ) 1. 

n n 



(iii) 



n-l 



4=0 



K,b;.o,bo{-^^^ 



(iv) 



where (i) follows from (|6.19|) . (ii) is obtained by conditioning by Xj, (iii) is a consequence of 
(IFT^ and (iv) of (ICTl) . Plugging this in itlHll) . we obtain (imi) . □ 

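Proof of (2) of Proposition 6.8. Let $Q$ be in $\mathcal{M}^n_{\varepsilon}$. As $Q$ belongs to $\overline{A^n_{\varepsilon}}$, there exist $\sigma \in \overline{\Sigma}_1$ and $b \in \overline{B}_{\varepsilon}$ such that (6.21) is fulfilled. According to (6.23), one has

$$H(Q \mid Q^n_{\sigma_0, b_0}) = H(Q \mid Q^n_{\sigma, b}) + H(Q^n_{\sigma, b} \mid Q^n_{\sigma_0, b_0}).$$

If $Q^n_{\sigma, b}$ belongs to $\overline{A^n_{\varepsilon}}$, then we deduce from the preceding equation that $H(\overline{A^n_{\varepsilon}} \mid Q^n_{\sigma_0, b_0}) \ge H(Q \mid Q^n_{\sigma, b}) + H(\overline{A^n_{\varepsilon}} \mid Q^n_{\sigma_0, b_0})$, and consequently $H(Q \mid Q^n_{\sigma, b}) = 0$, which implies that $Q = Q^n_{\sigma, b}$. Thus, the only thing to do is to prove that $Q^n_{\sigma, b} \in \overline{A^n_{\varepsilon}}$.

Let $(Q_p)_p$ be a sequence of $A^n_{\varepsilon}$ going to $Q$. For each $p$, there is a pair $(\sigma_p, b_p) \in \Sigma_1 \times B_{\varepsilon}$ such that (6.21) is fulfilled. For all $|j| \le k < n$, one has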

' k aj 



and 



n \ n 



k aj 



F, 



k,j{Vlp) 



n \ n 



where F^^j and G^j are defined by H6.16() and ()6.17|) . These functions being continuous, we 
have 



k aj 



k aj 



and 



0"„ 



k aj 



n 

k aj 
n' -v/n 



for all $|j| \le k < n$. It follows easily that

But according to (6.22),

Consequently, $Q^n_{\sigma, b}$ is in the closure of $A^n_{\varepsilon}$. □

Proof of Proposition 6.9. Recall that for all $\sigma \in \Sigma_0$, $I(\sigma \mid \sigma_0)$ is defined by



I{a I fJo) 



q{a\t,Xt),alit,Xt))dt 



with 



X \ X 

j{x, y) = log ( - ^ + log 



2 

a — X 
— y 



1-^ 



(1) Let us show that there exists some $K > 0$ depending only on $a$, $\sigma_{\min}$, $\sigma_{\max}$, $b_0$ and $s$, such that



(6.25) 



a,b;ao,bo 



' \n J n 



for all (A;, x) G {0, . . . , n - 1} X -^Z and {a, 6) G Sq x fi. 
For all (cr, 6) G X : 



log 
log 
log 



m"'(cT, b) 



m"((To,&o) 



d"(ao,6o) 
r"((T, 6) 



?'"('7o,^o). 



d'^ia, b) 
r"(a,6) = lo, 



log 



+ log 1 + 



log ^ + log 1 



— an 



ba 
ba 





boa \ 


X 




VncTo) . 


2a2 




boa Y 


X 


- 




2a2 



+ 



Using Taylor's formula, it is easily seen that for e G {— 1, 1}, 



sup 

y&[ho-s,b(,+s\ 



log 1 + 



eya 



eya ^1 ( eya 



^Jnx ) y/nx 2 \ y/nx 



< 



K 



n\ n 



with $K$ depending only on $a$, $\sigma_{\max}$, $\sigma_{\min}$, $b_0$ and $s$.

After some easy computations, one derives (6.25) from these inequalities.



In the sequel we will use the following notations 

71-1 



and 



1 



n 



i=0 



a 



n n ' 



,X i , cr. 



n " 



The function $q$ is bounded and continuous on $[\sigma^2_{\min}, \sigma^2_{\max}]^2$. $(\Phi^n)_n$ is thus a sequence of uniformly bounded continuous functions on $\Omega$, which converges pointwise to the bounded continuous function $\Phi$. Let us show that $\Phi^n$ converges uniformly to $\Phi$ on every compact subset of $\Omega$.
The function $q$ is Lipschitz on $[\sigma^2_{\min}, \sigma^2_{\max}]^2$; let $M > 0$ be such that

$$|q(x,y) - q(x',y')| \le M\left( |x - x'| + |y - y'| \right).$$

Let $\Delta$ be the continuity modulus of $\sigma^2$, i.e.

$$\Delta(u) = \sup_{|t - s| + |y - x| \le u} |\sigma^2(s,x) - \sigma^2(t,y)| ,$$

and $\Delta_0$ the continuity modulus of $\sigma_0^2$.
With these notations, we have



n-l 



n 

i=0 
n-l 



-Y,q(aU^,X,),aU-,X, 

n. ^ — ' \ \ TL n I \ n. ri 



n n 



i=0 



q a 



n n 



,Xi_ ] ,cj| 



T' 



n n 



q{a\t,Xt),al{t,Xt))dt 
q{a\t,Xt),a^oit,Xt)) 



n-l 



i=0 



a- I ) -a^{t,Xt 

n n 



+ 



dt 



dt 



< M 



< M 



< M 



sup |cj2(s,X,) + sup \a^{s,Xs)-a^{t,Xt)\ 
sup A{\s-t\ + \Xs- Xt\)+ sup Aq {\s - t\ + \X, - Xt\ 



1 



1 



A -+ sup \Xs-Xt\\ +Ao\ -+ sup \Xs-X, 



n 



n 



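Let $\mathcal{K}$ be a compact subset of $\Omega$. According to Ascoli Theorem, we have

$$\sup_{\omega \in \mathcal{K}} \ \sup_{|s - t| \le 1/n} |X_s(\omega) - X_t(\omega)| \xrightarrow[n \to +\infty]{} 0.$$

Thus

$$\sup_{\omega \in \mathcal{K}} |\Phi^n(\omega) - \Phi(\omega)| \xrightarrow[n \to +\infty]{} 0.$$

According to (6.25)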



< 



K 



n 



where $K$ depends only on $a$, $\sigma_{\max}$, $\sigma_{\min}$, $b_0$ and $s$. Using the uniform convergence of $(\Phi^n)_n$ on every compact and the tightness of the sequence $Q^n_{\sigma, b_n}$, it is now easy to see that

$$\lim_{n \to \infty} \frac{1}{n} H(Q^n_{\sigma, b_n} \mid Q^n_{\sigma_0, b_0}) = I(\sigma \mid \sigma_0).$$

According to (6.19) of Lemma 6.18,



1 

n 



f / 









where 



A:" = loe 



m"(cr, hr, 



It is easily seen that there is a constant $K$ depending only on $a$, $\sigma_{\min}$, $\sigma_{\max}$, $b_0$ and $s$ such that

Vi?>0, sup |A;"-/i^,;,,^.,^,feJ(t,x) sup |a„ - cT|(t, x). 



|z|<_R,tG[0,l] 



|z|<_R,tG[0,l] 



The sequence $Q^n_{\sigma_n, b_n}$ converging to $Q_{\sigma, b_0}$, it is a tight sequence. As a consequence, for all $\beta > 0$, there is $R > 0$ such that

$$Q^n_{\sigma_n, b_n}\left( \sup_{t \in [0,1]} |X_t| \le R \right) \ge 1 - \beta .$$

One can find $M > 0$ depending on $a$, $\sigma_{\min}$, $\sigma_{\max}$, $b_0$ and $s$, such that $|k^n| \le M$ and $|h^n_{\sigma, b_n; \sigma_0, b_0}| \le M$. Thus,



i=0 ^ 



E" 

iTn , On 



n-1 



i=0 



n 1 



i=0 



,^,,^-fc"|l[o,,?](sup 

t6[0,l] 



+ 2M(1 - (5) 



<K sup |o-„-a|(t,a;) + 2M(l-/3). 
\x\<R, te[o,i] 

One easily concludes that 

n— 1 



E' 



1 



n 



j=0 



n n 



E: 



n 



n-1 



; o"0i 



j=0 



n n 



0. 
A similar reasoning as in the proof of point (1) shows that



Tn, On 



i=0 

which completes the proof. □